Automatic Synthesis of Implementations for Abstract Data Types from Algebraic Specifications

Abstract 

Algebraic specifications have been used extensively to prove properties of abstract data types 
and to establish the correctness of implementations of data types. This thesis explores an 
automatic method of synthesizing implementations for data types from their algebraic 
specifications.

The inputs to the synthesis procedure consist of a specification for the implemented type, a 
specification for each of the implementing types, and a formal description of the 
representation scheme to be used by the implementation. The output of the procedure 
consists of an implementation for each of the operations of the implemented type in a simple 
applicative language. 

The inputs and the output of the synthesis procedure are precisely characterized. A formal 
basis for the method employed by the procedure is developed. The method is based on the 
principle of reversing the technique of proving the correctness of an implementation of a data 
type. The restrictions on the inputs, and the conditions under which the procedure 
synthesizes an implementation successfully are formally characterized. 

1. Introduction

1.1 Goals of the Thesis 

This thesis is concerned with the problem of automatic synthesis of implementations 
for abstract data types from their algebraic specifications. The inputs to the synthesis 
procedure include (i) a formal specification of the data type to be implemented, (ii) a formal 
specification of each of the implementing types, and (iii) a formal description of the 
representation scheme to be used by the desired implementation. The output consists of an 
implementation for each of the operations of the implemented type. The inputs are specified 
using an algebraic specification technique [14, 18, 25]. 

The thesis has three main goals: 

(1) To characterize precisely both the inputs and the output of the synthesis procedure.

(2) To devise an automatic method of deriving the output from the inputs. 

(3) To provide a formal basis for the method.

The method of derivation is described in terms of a set of synthesis rules. The 
output is derived by invoking the synthesis rules a finite number of times. The thesis 
describes how the synthesis rules are used in deriving a suitable implementation. 

The purpose of providing a formal basis for the method is to justify the correctness 
of the implementations derived by the synthesis procedure. The formal basis also helps in 
characterizing the scope of the synthesis procedure. 

1.2 Motivation for The Research 

The reliability of computer software has received a great deal of attention in recent 
years. Rapid advances in hardware technology have dramatically decreased the cost of 
hardware relative to software. As a result, the cost of producing and maintaining software has 
become a major concern. An effective way of improving the reliability and reducing the cost of
software simultaneously is to find methods that decrease the effort required to produce correct
software. Active research is underway [43] in exploring this avenue. Several
approaches have been proposed, each of which falls under one of the following three
categories based on the degree of automation it offers: manual approaches, semi-automatic
approaches, and automatic approaches.

The manual approach advocates discipline in human programming [31, 11, 41]. It 
consists of identifying new mechanisms of abstractions [32] that encourage the advocated 
discipline. The most significant contribution of this approach has been the inducement of a
change in the attitude of programmers towards the style of programming. Concrete 
manifestations of this change include the birth of the concept of abstract data types, and the 
development of new languages [34, 29, 52] to support data types. 

The goal of the semi-automatic approach is to seek machine help to establish the 
correctness of programs written by the user. Formal methods are developed to specify and 
verify properties of pieces of software [13, 12, 20]; systems are built to carry out verification 
automatically or semi-automatically [27, 15]. A variant of the verification method is the
programmer's apprentice method [19]. The programmer's apprentice provides an interactive
programming environment built from a set of tools that help the programmer in
preparing and checking his work in several ways. The tools range from simple editors to
more sophisticated ones that can analyze and criticize a user's program during the various
phases of programming. Yet another way of providing partial machine help is to build
systems [2, 3, 48] that will help apply transformation rules chosen from a catalogue of 
equivalence preserving transformations. The programmer can refine or improve the 
efficiency of his programs by judiciously choosing the appropriate rules from the catalogue. 

The automatic approach, under which our research falls, seeks to automate a part or 
all of the programming process itself. Its goal is to generate code for programs from their 
high-level declarative descriptions, thereby relieving the programmer of having to worry 
about error-prone, low-level details of programming. Though this may one day be feasible, 
experience [1, 36] in the last few years shows that not nearly enough is known about the 
process to automate it completely. Two remedies have been used with some success to break 
the stalemate in the situation: The first is to restrict the domain for which programs are being 
synthesized [4]; the second is to expect the user to furnish more information about the desired 
properties of the program [6] to guide the synthesis procedure. 

A third course of action that has not so far been employed in earnest is to
complement the automatic approach with recent advances in programming methodology.
(Bauer et al. [3] have employed this idea with the semi-automatic approach.) In particular,
the idea of designing software as a hierarchy of abstractions can be used to aid the synthesis 
procedure. Such a hierarchical design for the program reduces the amount of refinement 
required to be performed by the synthesizer at each step. 

The thesis takes into consideration all the factors mentioned above. Within the 
general area of programming, we restrict ourselves to the study of synthesis of 
implementations for abstract data types. We believe that the synthesis of implementations for 
abstract data types is amenable to automation because the specification techniques for data 
types have been extensively studied, and hence, are better understood. We also expect 
additional information about the implementation to be furnished by the user. This 
information is provided in the form of a description of the representation scheme to be used 
by the implementation. 

1.3 Related Work 

The works related to ours lie partly in the area of general program synthesis and 
partly in the area of automatic implementation of data structures. 

In the general area of synthesis, the work most closely related to ours is that of 
Darlington [8, 9]. He has developed a system that uses a set of transformation rules to 
improve semi-automatically the efficiency of recursive programs and also to construct new 
recursive programs. Recently, he has also applied the transformation rules to synthesize 
implementations for data types [7]. The synthesis rules developed in the thesis are closely 
related to his. The difference lies in the method in which the synthesis rules are used to 
synthesize implementations. Our method is based on verification techniques of data types. 
Our work has two advantages over his. Firstly, the class of implementations derived by our 
method is larger than his. This is because we develop more ways of using the synthesis rules 
for deriving implementations. Secondly, we formally characterize the conditions under which 
the synthesis rules yield a correct implementation for data types. 

The ZAP system [30] of Feather is a program transformation system in which the
basic rules of manipulation are similar to our synthesis rules. His work differs from ours
in two ways. Firstly, he is concerned with developing higher level strategies to apply the basic
transformation rules (in general, any equivalence preserving rules) for the construction of
large programs. Secondly, his approach is less automatic than ours. The emphasis in
the design of ZAP is on using "metaprograms" to improve communication between the user and
the system. There are two inputs to ZAP: the specification of the program to be constructed
and a metaprogram which consists of a sequence of commands that direct the transformation
process. The metaprogram expresses the higher level strategy to be used in applying the
transformation rules.

Within the area of automatic implementations for data structures, the work of 
Okrent [40] has goals closest to ours. Okrent's method uses only the algebraic specifications 
of the data types involved as inputs. Because of the lack of information about the desired 
representation scheme, the implementations generated by his synthesis procedure are not as 
interesting as the ones generated by ours. He limits severely the range of the data types 
acceptable as inputs. He also concentrates on a fixed set of target structures such as 
contiguous memory and heap memory for the implementations. 

Another work in this area that is related to ours is that of Subrahmanyam [50].
Subrahmanyam's method, like Okrent's, does not use any information about the
representation scheme. His method has a provision for the user to specify performance
constraints on the desired implementation. The method is based on partitioning the
operation set of the data type into a kernel set and a nonkernel set. Implementations for the
kernel operations are derived by identifying pairs of functions (on the representation type)
called retrievable insertion function pairs. Implementations for the nonkernel operations are
derived in terms of the implementations for the kernel operations so as to meet the
performance constraints.

Most of the other research in the automatic generation of data structure 
implementations has been concerned with the automatic selection of an optimal 
representation for data structures. Rowe and Tonge [47], Rovner [46], and Tompa and 
Gotlieb [51] have studied optimization problems for a language containing a fixed set of
high-level data structures. First they build a library of possible implementations for each fixed
high-level data structure in the language, along with a parameterized description of the
performance of each library entry. Then they proceed to select the "best" implementation for
each instance of the data structure, by making a flow analysis of the program that uses the
data structure. The goal of our work is to derive an implementation for a given 
representation rather than to select an optimal one among a given set of representations. 

Standish et al. [49], Bauer et al. [3], and Wile et al. [2] have developed catalogues
of equivalence preserving transformation rules as a part of program development systems.
The programmer can refine or improve the efficiency of his programs by instructing the
system to apply appropriate transformation rules on the programs. None of these works,
however, deals explicitly with the implementation of data types. It is possible, with some
modifications, to incorporate our synthesis rules as a part of their systems.

1.4 Organization of the Thesis 

The next chapter gives an overview of the synthesis procedure. The third chapter 
describes in detail the inputs of the synthesis procedure, and formalizes the restrictions on the 
inputs. The synthesis procedure derives an implementation in two stages: The 
implementation is first derived in a preliminary form which is then transformed into a final 
form. The first stage of the procedure is the topic of the fourth and the fifth chapters. The 
sixth chapter describes the second stage. The last chapter gives the concluding remarks. 

2. An Overview of the Synthesis Procedure

This chapter gives an overview of the synthesis procedure. The first section gives a 
scenario of the synthesis procedure from a user's point of view. It briefly describes the form 
of the inputs to the synthesis procedure, and the form of its outputs via an example. The 
second section gives a summary of the synthesis procedure. It points out the nontrivial issues 
involved in the method employed by the procedure for deriving an implementation. The last 
section describes the scope of the procedure. 

2.1 The User's View 

Consider the following scenario involving a programmer. The programmer has 
designed an abstract data type (the implemented type) to be used in solving one of his 
programming problems. He is now seeking the help of a system for implementing the type 
using another data type, called the representation type. The representation type is chosen by
the user himself. Furthermore, he is willing to furnish information about how he wants the 
values of the representation type to be used in representing the values of the implemented 
type. The system is expected to generate automatically (or with some help from the user) an 
implementation for the implemented type that uses the representation type as the 
representation in a manner consistent with that suggested by the user. 

Viewed as a black box, the inputs to the procedure are: 

(i) A specification of the implemented type, 

(ii) a specification of the representation type, and specifications of all the types used in 
the specification of the representation type. We refer to the representation type, and 
all the types its specification uses as the implementing types. 

(iii) an association specification that describes how the values of the representation type 
are to be used in representing the values of the implemented type; this corresponds 
to the representation (or abstraction) function defined by Hoare in [21].

The output of the synthesis procedure consists of an implementation for each of the 
operations of the implemented type in terms of the operations of the implementing types. To
get a better idea about the inputs and the output, let us consider an example of deriving an
implementation for the data type Queue_Int in terms of Circ_List. Queue_Int is a
first-in-first-out queue of integers. Elements are added to a queue at the rear end, and
removed from the front end. Circ_List is a list of integers. Elements are inserted into and
removed from a list at the same end, which is the rear end of the list. The operation that gives
Circ_List a circular character is Rotate. Rotate moves every element in a list by one position
towards the rear end in a cyclic fashion, i.e., the element at the rear end is moved to the front
end.

In this example, the implemented type is Queue_Int and the representation type is
Circ_List. Circ_List uses (this notion is defined precisely in the next chapter) the data types
Integer and Bool, so the implementing types include Circ_List, Integer, and Bool. Figures 1,
2, and 3 give the inputs to the synthesis procedure. (The figures also give an informal
description of the operations of the data types.) Specifications of Integer and Bool should
also be given as inputs, although we have not shown them here. The language used to express
the data type specifications is equational, similar to the ones developed in [14, 18, 25]. One of
the crucial differences is the following: We assume that the specification of every data type
identifies a basis for the data type. A basis is a minimal set of operations of the data type that
can be used to generate all the values of the type. The operations in the basis are called the
generators of the type. For example, the operations Create and Insert can be the generators
for Circ_List. The specification language is described in the next chapter.

Fig. 3 gives the association specification for the implementation to be derived. It 
characterizes the representation scheme to be used by the implementation. The association 
specification is expressed in two parts. The first part specifies the invariant I. I is a predicate
that specifies the set of values that may be used to represent the values of the implemented
type; only those values of the representation type for which I is True may be used to
represent the values of the implemented type. In the present example, I is True for all values
of Circ_List. The second part specifies the abstraction function A; A maps a value of the
representation type to the value of the implemented type that the former may represent. In
the present example A specifies the following mapping: The empty queue is represented by

Fig. 1. Specification of Queue_Int

Queue_Int is Nullq, Enqueue, Front, Dequeue, Append, Size

Defining Types
    Bool, Int

Operations
    Nullq   :                       -> Queue_Int
    Enqueue : Queue_Int X Int       -> Queue_Int
    Front   : Queue_Int             -> Int U { ERROR }
    Dequeue : Queue_Int             -> Queue_Int U { ERROR }
    Append  : Queue_Int X Queue_Int -> Queue_Int
    Size    : Queue_Int             -> Int

Comment
    Queue_Int is a FIFO queue of integers. Nullq constructs the empty queue. Enqueue adds an element to
    a queue at the rear end. Dequeue removes the element at the front of a queue. Front returns the
    element at the front of a queue. Append joins two queues, adding the elements of the second argument
    at the rear of the first argument. Size computes the number of elements in a queue.

Basis
    { Nullq, Enqueue }

Axioms
    (1) Front(Nullq) = ERROR
    (2) Front(Enqueue(Nullq, e)) = e
    (3) Front(Enqueue(Enqueue(q, e1), e2)) = Front(Enqueue(q, e1))
    (4) Dequeue(Nullq) = ERROR
    (5) Dequeue(Enqueue(Nullq, e)) = Nullq
    (6) Dequeue(Enqueue(Enqueue(q, e1), e2)) = Enqueue(Dequeue(Enqueue(q, e1)), e2)
    (10) Append(q, Nullq) = q
    (11) Append(q1, Enqueue(q2, e2)) = Enqueue(Append(q1, q2), e2)
    (12) Size(Nullq) = 0
    (13) Size(Enqueue(q, e)) = Size(q) + 1

Fig. 2. Specification of Circ_List

Circ_List is Create, Insert, Value, Remove, Rotate, Empty, Join

Defining Types
    Integer, Boolean

Operations
    Create :                       -> Circ_List
    Insert : Circ_List X Integer   -> Circ_List
    Value  : Circ_List             -> Integer U { ERROR }
    Remove : Circ_List             -> Circ_List U { ERROR }
    Rotate : Circ_List             -> Circ_List
    Empty  : Circ_List             -> Boolean
    Join   : Circ_List X Circ_List -> Circ_List

Comment
    Circ_List is a list of integers with a front end and a rear end. Create constructs an empty list; the front
    and the rear ends of an empty list are the same. Insert inserts an element into a list at the rear end.
    Value returns the element at the rear end of a list. Remove removes the element at the rear end from a
    list. Rotate moves every element in a list by one position towards the rear end in a cyclic fashion, i.e.,
    the element at the rear is moved to the front. Empty checks if a list is empty. Join joins two lists by
    positioning the first argument in front of the second.

Basis
    { Create, Insert }

Axioms
    (1) Value(Create) = ERROR
    (2) Value(Insert(c, i)) = i
    (3) Remove(Create) = ERROR
    (4) Remove(Insert(c, i)) = c
    (5) Rotate(Create) = Create
    (6) Rotate(Insert(Create, i)) = Insert(Create, i)
    (7) Rotate(Insert(Insert(c, i1), i2)) = Insert(Rotate(Insert(c, i2)), i1)
    (8) Empty(Create) = true
    (9) Empty(Insert(c, i)) = false
    (10) Join(c, Create) = c
    (11) Join(c, Insert(d, i)) = Insert(Join(c, d), i)

Fig. 3. Association Specification

Invariant
    I(c) = True

Abstraction Function
    A(Create) = Nullq
    A(Insert(c, i)) = Add_at_head(A(c), i)

    Add_at_head(Nullq, i) = Enqueue(Nullq, i)
    Add_at_head(Enqueue(q, i), i1) = Enqueue(Add_at_head(q, i1), i)



the empty list. A nonempty queue is represented by a list whose elements are identical to the
ones in the queue, but are arranged in the reverse order. The motivation for this
representation scheme is that reading and deletion of elements from a queue can be
performed efficiently. Note that the specification of A uses an auxiliary function
Add_at_head on Queue_Int; the auxiliary function adds an element at the front end of a
queue.
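
The association specification of Fig. 3 can be made concrete with a small executable model. The following Python sketch is an illustration only and is not part of the synthesis procedure. It assumes that a Circ_List is modelled as a Python list whose last element is the rear end, and that a Queue_Int is modelled as a Python list whose first element is the front end. The names `enqueue`, `add_at_head`, and `abstraction` are chosen here for the illustration.

```python
def enqueue(q, e):
    """Model of Enqueue: add e at the rear end of the queue (end of the list)."""
    return q + [e]


def add_at_head(q, e):
    """Auxiliary function of Fig. 3, written axiom by axiom."""
    if not q:                        # Add_at_head(Nullq, e) = Enqueue(Nullq, e)
        return enqueue([], e)
    *rest, last = q                  # q = Enqueue(rest, last)
    return enqueue(add_at_head(rest, e), last)


def abstraction(c):
    """Abstraction function: maps a list model of Circ_List to the queue it represents."""
    if not c:                        # A(Create) = Nullq
        return []
    *rest, i = c                     # c = Insert(rest, i); i sits at the rear end
    return add_at_head(abstraction(rest), i)
```

Under these modelling assumptions, `abstraction([1, 2, 3])` evaluates to `[3, 2, 1]`: the element at the rear end of the list is the element at the front of the queue, which is the reversal described above.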

Fig. 4 shows the output of the synthesis procedure. The output defines a set of
functions, called the implementing functions, on Circ_List. Every implementing function
implements an operation of Queue_Int. The function implementing the
operation f is given the name F. For instance, NULLQ implements Nullq. The target



Fig. 4. An Implementation

NULLQ()       ::= Create()
ENQUEUE(c, j) ::= Rotate(Insert(c, j))
FRONT(c)      ::= Value(c)
DEQUEUE(c)    ::= Remove(c)
APPEND(c, d)  ::= Join(d, c)
SIZE(c)       ::= if Empty(c) then 0
                  else SIZE(Remove(c)) + 1
language used to express the implementations for the operations is a simple applicative
language. The only mechanisms available in the language to build programs are functional
composition, conditional expressions, and recursive function definition. The language uses a
method of defining functions that is customarily used in applicative languages like pure LISP
[37]. A function F is defined using the following schema: F(v1, ..., vk) ::= e, where
v1, ..., vk are variables, and e is an expression containing those variables. A function
definition may use the operations of the implementing types as base functions.
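
To see the implementation of Fig. 4 at work, it can be transcribed into Python using only composition, conditionals, and recursion. This is a sketch under stated assumptions, not the thesis's target language: Circ_List is modelled as a Python list with the rear end at the end of the list, `ERROR` is a sentinel string, and the lower-case functions are a hand-written model of the Circ_List operations of Fig. 2.

```python
ERROR = "ERROR"  # sentinel standing in for the ERROR value of the specifications

# Hand-written model of the Circ_List operations (rear end = end of the list).
def create():      return []
def insert(c, i):  return c + [i]
def value(c):      return c[-1] if c else ERROR
def remove(c):     return c[:-1] if c else ERROR
def rotate(c):     return c[-1:] + c[:-1]   # rear element moves to the front
def empty(c):      return not c
def join(c, d):    return c + d             # first argument in front of the second

# Implementing functions, transcribed from Fig. 4.
def NULLQ():        return create()
def ENQUEUE(c, j):  return rotate(insert(c, j))
def FRONT(c):       return value(c)
def DEQUEUE(c):     return remove(c)
def APPEND(c, d):   return join(d, c)
def SIZE(c):        return 0 if empty(c) else SIZE(remove(c)) + 1
```

For example, enqueueing 1, 2, and 3 in that order yields the list `[3, 2, 1]` in this model, so `FRONT` returns 1 and `DEQUEUE` leaves `[3, 2]`, as a first-in-first-out queue requires.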

2.2 A Summary of the Synthesis Procedure 

The synthesis procedure is summarized in an illustrative fashion using the example 
already introduced. This is done in the first two subsections. In the example introduced, the 
invariant I is a trivial one: It is True on all values. In the third subsection, we highlight the
issues involved in deriving an implementation in the presence of a nontrivial invariant by 
introducing a new example. 

The method used by the procedure to derive an implementation is based on treating 
every equation in the specifications as a rewrite rule. The procedure begins by combining all 
the input specifications into a rewriting system called the Initial World (IW). (IW is obtained 
by simply replacing the symbol = by -> in the input specifications.) The procedure assumes
that IW satisfies the uniform termination property as well as the unique termination property. 
(IW is said to be convergent in such a case. This is similar to the Church-Rosser property.) 
The uniform termination property ensures that every chain of reductions starting from an 
expression terminates. The unique termination property ensures that all chains of reductions 
starting from an expression terminate in the same expression. These two properties ensure 
that the equivalence relation characterized by a specification can be computed by using the 
rules in IW for reducing expressions. The procedure also assumes that there is a predefined 



2. A rewrite rule (written α -> β) is an ordered pair of expressions: a left hand side and a right hand
side. A rewrite rule can be used to reduce any expression that is an instance of the left hand
side into an expression that is an instance of the right hand side. A rewriting system is a set of rewrite 
rules. 
termination ordering (≻) on expressions which can be used for showing the uniform
termination property of rewriting systems. 
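
The idea of computing with equations as rewrite rules can be sketched in a few lines of Python. The sketch below is illustrative and makes its own representation choices: an expression is a tuple whose first component is the function symbol, a rule variable is a plain string, and integers stand for themselves. The rules shown are the Front axioms of Fig. 1 read from left to right. The function names `match`, `substitute`, `rewrite_once`, and `normal_form` are hypothetical.

```python
def match(pattern, term, subst):
    """Extend subst so that pattern instantiated by subst equals term, else None."""
    if isinstance(pattern, str):                      # a rule variable
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if not isinstance(pattern, tuple):                # a literal such as an integer
        return subst if pattern == term else None
    if (not isinstance(term, tuple) or term[0] != pattern[0]
            or len(term) != len(pattern)):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst


def substitute(term, subst):
    """Replace every rule variable in term by its binding."""
    if isinstance(term, str):
        return subst[term]
    if isinstance(term, tuple):
        return (term[0],) + tuple(substitute(a, subst) for a in term[1:])
    return term


def rewrite_once(term, rules):
    """Perform one reduction step; return None if term is irreducible."""
    for lhs, rhs in rules:
        s = match(lhs, term, {})
        if s is not None:
            return substitute(rhs, s)
    if isinstance(term, tuple):
        for k, arg in enumerate(term[1:], start=1):
            new = rewrite_once(arg, rules)
            if new is not None:
                return term[:k] + (new,) + term[k + 1:]
    return None


def normal_form(term, rules):
    """Reduce until no rule applies (assumes the rules terminate)."""
    while (nxt := rewrite_once(term, rules)) is not None:
        term = nxt
    return term


NULLQ = ("Nullq",)
FRONT_RULES = [
    (("Front", NULLQ), ("ERROR",)),
    (("Front", ("Enqueue", NULLQ, "e")), "e"),
    (("Front", ("Enqueue", ("Enqueue", "q", "e1"), "e2")),
     ("Front", ("Enqueue", "q", "e1"))),
]
```

With these rules, the expression Front(Enqueue(Enqueue(Enqueue(Nullq, 1), 2), 3)) reduces in three steps to 1. The loop in `normal_form` relies on the uniform termination property; the unique termination property is what makes the result independent of the order in which rules are tried.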

The synthesis procedure derives the implementation in two stages. In the first stage 
the procedure derives the implementation in an intermediate form. The intermediate form is 
called a preliminary implementation. In the second stage the preliminary implementation is 
transformed into an implementation in the target language (target implementation). Fig. 5 
gives a preliminary implementation for Queue_Int that is consistent with the association
specification given in Fig. 3. There are two crucial differences between a preliminary 
implementation and a target implementation. The first one concerns the methods used for 
defining the implementing functions. A preliminary implementation defines a function as a 
set of rewrite rules. The rewrite rules defining an implementing function F are the ones that 
have F as the outermost symbol on their left hand side. For instance, rules (2) and (3) in 
Fig. 5 define ENQUEUE. The second difference is that the only operations of the 
representation type that are permitted to appear in a preliminary implementation are its 
generators. A target implementation is permitted to use all the operations of the 
representation type. In the example under consideration, for instance, a preliminary 
implementation may use all the operations of Integer and Bool, but only the generators 



Fig. 5. A Preliminary Implementation 

(1) NULLQ() -> Create()
(2) ENQUEUE(Create, j) -> Insert(Create, j)
(3) ENQUEUE(Insert(c, i), j) -> Insert(ENQUEUE(c, j), i)
(4) FRONT(Create) -> ERROR
(5) FRONT(Insert(c, i)) -> i
(6) DEQUEUE(Create) -> ERROR
(7) DEQUEUE(Insert(c, i)) -> c
(8) APPEND(c, Create) -> c
(9) APPEND(c, Insert(d, i)) -> APPEND(ENQUEUE(c, i), d)
(10) SIZE(Create) -> 0
(11) SIZE(Insert(c, i)) -> SIZE(c) + 1
(Create and Insert) of Circ_List.

There are two reasons for the decomposition. Firstly, it makes the synthesis 
procedure more modular. Target language dependent transformations are separated from the 
language independent transformations. The decomposition also lends itself naturally to 
deferring efficiency improving transformations to the later stage. In the first stage one can 
concentrate on deriving a simple correct implementation. Secondly, the decomposition
reduces the complexity of the structure of the synthesis procedure. The first stage deals with the
techniques for deriving an implementation from the specification of the data type. The
second stage deals with the techniques for deriving alternate forms of implementations from
a preliminary implementation. The decomposition provides a better insight into the
synthesis method, and simplifies the description of the synthesis procedure. The next two 
subsections give an overview of the two stages of the synthesis procedure. 

2.2.1 Stage 1: Preliminary Implementation Derivation 

A preliminary implementation of a data type is correct with respect to an abstraction
function A if the following condition holds: Every implementing function F (that implements
the operation f) defined by the preliminary implementation is a total function on the
representation values so that the homomorphism property H(F(x)) = f(H(x)) holds. Here H
is a function on the values of the implementing types; H behaves exactly like the abstraction
function A on the representation values, and like an identity function on all other values. The
synthesis procedure derives a preliminary implementation so that the above criterion of 
correctness is satisfied. 
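
The homomorphism property can be tested (though not proved) on small representation values. The Python sketch below does this for ENQUEUE, using rules (2) and (3) of the preliminary implementation in Fig. 5. It is an illustration under assumptions of its own: generator constants of Circ_List are nested tuples, queues are modelled as tuples with the front element first, and `H` is the name used here for the function that acts as the abstraction function on lists and as the identity on integers.

```python
from itertools import product

CREATE = ("Create",)


def insert(c, i):
    """Generator Insert, kept symbolic as a nested tuple."""
    return ("Insert", c, i)


def ENQUEUE(c, j):
    """Rules (2) and (3) of the preliminary implementation."""
    if c == CREATE:
        return insert(CREATE, j)              # ENQUEUE(Create, j) -> Insert(Create, j)
    _, rest, i = c
    return insert(ENQUEUE(rest, j), i)        # ENQUEUE(Insert(c, i), j) -> Insert(ENQUEUE(c, j), i)


def enqueue(q, e):
    """Abstract Enqueue on the queue model: add e at the rear (end of the tuple)."""
    return q + (e,)


def H(c):
    """H(Create) = Nullq; H(Insert(c, i)) = Add_at_head(H(c), i)."""
    if c == CREATE:
        return ()
    _, rest, i = c
    return (i,) + H(rest)                     # Add_at_head puts i at the front


def generator_constants(depth, values=(0, 1)):
    """All Circ_List generator constants built with at most `depth` Inserts."""
    level, result = [CREATE], [CREATE]
    for _ in range(depth):
        level = [insert(c, v) for c, v in product(level, values)]
        result.extend(level)
    return result


def homomorphism_holds(depth=3, values=(0, 1)):
    """Check H(ENQUEUE(c, j)) == Enqueue(H(c), H(j)); H is the identity on integers."""
    return all(H(ENQUEUE(c, j)) == enqueue(H(c), j)
               for c in generator_constants(depth, values)
               for j in values)
```

A successful check over all lists up to a given depth is only evidence; the synthesis procedure instead guarantees the property by construction.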

The procedure synthesizes the preliminary implementation for one operation at a 
time by deriving a separate set of rewrite rules for every operation. Since the method used is 
the same for every operation, we illustrate the synthesis of only a couple of operations. The 
procedure first determines the left hand sides of all the rules of the preliminary 
implementation. Then, it determines a suitable right hand side for each of the rules. 

2.2.1.1 Determining the Left Hand Side

One of the correctness requirements of a preliminary implementation is that it must
define a total function on the representation type. This requirement is ensured by deriving
the rules of the preliminary implementation so that (1) they satisfy the uniform termination
property, and (2) they are well-spanned. The first property is ensured while deriving the right
hand side of the rules. The second property is used to determine the left hand sides.

The second property requires the left hand side expressions of the rules to be of a 
particular form. For instance, any pair of rules that have the form given below constitute a 
well-spanned set of rules for ENQUEUE. (In the following, ?rhs1 and ?rhs2 are used as place
holders for expressions to be determined later.)

ENQUEUE(Create, j) -> ?rhs1

ENQUEUE(Insert(c, i), j) -> ?rhs2

Note that the left hand side of each of the above rules consists of ENQUEUE
applied to arguments that are generator expressions (see footnote 3). The set of arguments, i.e., sequences of
generator expressions, to ENQUEUE on the left hand side of the rules is
ArgsSet = {<Create, j>, <Insert(c, i), j>}. ArgsSet spans the set of all ordered pairs of
generator constants. In other words, every pair of generator constants is an instance of one of 
the arguments in ArgsSet. This property ensures that the definition of ENQUEUE accounts 
for all the representation values. It is easy to build a procedure that automatically generates a 
well-spanned ArgsSet, once the generators of the representation type are identified. Thus, an 
appropriate set of left hand sides for the rewrite rules to be derived can be determined 
automatically. 
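
One simple way to generate such an argument set is sketched below in Python. It is a hypothetical illustration, not the procedure of the thesis: it splits exactly one argument position over the generators of its type and leaves every other argument as a fresh variable. The table `GENERATORS` and the function `args_set` are names introduced for the sketch.

```python
from itertools import count

# Generators of each representation type, with the types of their parameters.
GENERATORS = {
    "Circ_List": [("Create", ()), ("Insert", ("Circ_List", "Integer"))],
}


def args_set(arg_sorts, split_at):
    """Return one argument tuple per generator of the type at index split_at.

    The argument at split_at becomes a generator applied to fresh variables;
    all other arguments stay fresh variables, so every tuple of generator
    constants is an instance of exactly one returned tuple.
    """
    rows = []
    for name, param_sorts in GENERATORS[arg_sorts[split_at]]:
        fresh = count(1)                      # variable names restart for each rule
        row = []
        for k in range(len(arg_sorts)):
            if k == split_at:
                params = ", ".join(f"v{next(fresh)}" for _ in param_sorts)
                row.append(f"{name}({params})" if param_sorts else name)
            else:
                row.append(f"v{next(fresh)}")
        rows.append(tuple(row))
    return rows
```

For ENQUEUE, `args_set(("Circ_List", "Integer"), 0)` yields the two tuples `("Create", "v1")` and `("Insert(v1, v2)", "v3")`, which correspond to the ArgsSet given above up to the naming of variables.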



3. A generator expression is an expression in which the only function symbols involved are the 
generators. A generator constant is a generator expression that does not contain any variables. 

2.2.1.2 Determining the Right Hand Side

The right hand sides of the rules are determined so that the preliminary
implementation satisfies the homomorphism property mentioned earlier. For this, the Initial
World, IW, is first supplemented with a set of rules, called the H-rules. The H-rules express
the homomorphism property; there is an H-rule for every implementing function. For
instance, the H-rule corresponding to ENQUEUE is

H(ENQUEUE(c, j)) -> Enqueue(H(c), H(j)).

Let us call the supplemented system the Perturbed World (PW) (see footnote 4).

The Perturbed World (PW) is then used to derive a set of synthesis equations, one 
equation for every rule in the preliminary implementation. The right hand side of a rule is 
determined from the right hand side of the corresponding synthesis equation. For instance, 
the synthesis equation corresponding to the rule ENQUEUE(Insert(c, i), j) -> ?rhs2 is an
equation of the form H(ENQUEUE(Insert(c, i), j)) = H(?rhs2) that satisfies the following
conditions:

(1) H(ENQUEUE(Insert(c, i), j)) = H(?rhs2) is a theorem of PW.

(2) ENQUEUE(Insert(c, i), j) ≻ ?rhs2

(3) ?rhs 2 contains only the permitted operations of the implementing types, and the 
implementing functions. 

The Synthesis Theorem in chapter 4 shows that, when a preliminary 
implementation is well-spanned, the preliminary implementation satisfies the 
homomorphism property if the synthesis equation corresponding to each of the rules in the 
preliminary implementation is a theorem of PW. Note that the second condition above 
ensures that the rewrite rules derived satisfy the uniform termination property. The third 
condition ensures the syntactic correctness of the preliminary implementation. 
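
As an illustration of condition (1), the candidate right hand side Insert(ENQUEUE(c, j), i) (rule (3) of Fig. 5) can be checked by reducing both sides of the synthesis equation in PW. The derivation below is a sketch. It writes H for the homomorphism function introduced in section 2.2.1, and it assumes that H acts as the identity on integers and that the rules for H on Circ_List are those of the abstraction function in Fig. 3.

```latex
\begin{align*}
H(\mathrm{ENQUEUE}(\mathrm{Insert}(c,i),\, j))
  &\to \mathrm{Enqueue}(H(\mathrm{Insert}(c,i)),\, H(j))
     && \text{$H$-rule for ENQUEUE} \\
  &\to \mathrm{Enqueue}(\mathrm{Add\_at\_head}(H(c),\, i),\, j)
     && \text{$H$ on Insert; $H(j) = j$} \\[6pt]
H(\mathrm{Insert}(\mathrm{ENQUEUE}(c,j),\, i))
  &\to \mathrm{Add\_at\_head}(H(\mathrm{ENQUEUE}(c,j)),\, i)
     && \text{$H$ on Insert} \\
  &\to \mathrm{Add\_at\_head}(\mathrm{Enqueue}(H(c),\, j),\, i)
     && \text{$H$-rule for ENQUEUE; $H(j) = j$} \\
  &\to \mathrm{Enqueue}(\mathrm{Add\_at\_head}(H(c),\, i),\, j)
     && \text{second Add\_at\_head rule of Fig.~3}
\end{align*}
```

Both sides reduce to the same normal form, so under these assumptions the equation is a theorem of PW. Conditions (2) and (3) still have to be checked separately.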



4. Note that since H is a function that behaves essentially like A, the rewrite rules specifying it in PW
are obtained by simply replacing A by H in the association specification.

2.2.1.3 Deriving the Synthesis Equations

Every synthesis equation of the preliminary implementation is derived with the help 
of two inference rules called the synthesis rules. The synthesis rules are designed for 
generating theorems of PW that have the same left hand sides, but different right hand sides. 
For deriving a synthesis equation, the synthesis rules are invoked repeatedly a finite number 
of times to generate a series of theorems until the desired equation is generated. For instance, 
the synthesis equation corresponding to the rule ENQUEUE(Insert(c, i), j) -> ?rhs2 is derived
by generating a series of theorems that have ℋ(ENQUEUE(Insert(c, i), j)) as their left hand
side. The generation continues until a theorem whose right hand side qualifies the theorem 
to be a synthesis equation is encountered. 

The idea used for generating an equation is to reverse the method of demonstrating 
that such an equation is a theorem of PW. The central notion used in the generation is a 
mechanism called expansion. Expansion 5 is the opposite of reduction. It is the act of 
applying a rewrite rule to an expression from right to left.

For example, consider the rule ℋ(ENQUEUE(c, j)) -> Enqueue(ℋ(c), ℋ(j)), and
the expression Add_at_head(Enqueue(ℋ(Create), ℋ(i)), k). The subexpression
Enqueue(ℋ(Create), ℋ(i)) is an instance of the right hand side of the rule for the substitution
{c ↦ Create, j ↦ i}. The corresponding instance of the left hand side is
ℋ(ENQUEUE(Create, i)). Therefore, Add_at_head(Enqueue(ℋ(Create), ℋ(i)), k) expands to
Add_at_head(ℋ(ENQUEUE(Create, i)), k) by the rule.

The first synthesis rule specifies a way of generating a theorem from an expression 
with that expression as the left hand side. In the following, e↓ denotes the normal form of e
obtained using PW. 6 (The normal form of e is the result of reducing it using the rewrite rules 
of PW until it becomes irreducible.) 



5. The definition of expansion will be revised later in chapter 4 to make it more general. According to 
the definition given here, expansion is identical to the transformation technique folding used by 
Darlington [7] for synthesis of recursive programs.

6. PW is a convergent system. Therefore, every expression is guaranteed to have a unique normal 
form. 
              e is an expression
Rule 1:      --------------------
                   e = e↓

The second synthesis rule specifies how to generate a theorem from an existing one
so that the new theorem has the same left hand side as the old one. In the following,
expand(e2) denotes any expression that is an expansion of e2 using some rewrite rule of PW.

                   e1 = e2
Rule 2:      --------------------
               e1 = expand(e2)

We investigate two methods in which the synthesis rules can be used for deriving a 
synthesis equation. The first method derives synthesis equations that are in the equational 
theory of PW. The second method derives equations that are in the inductive theory. The 
second method is more general than the first one. A system that implements the synthesis 
procedure would, therefore, use only the second method. We discuss them separately for 
pedagogic reasons. 

2.2.1.3.1 Derivation in the Equational Theory 

As an illustration, let us derive a synthesis equation that is of the form
ℋ(ENQUEUE(Insert(c, i), j)) = ℋ(?rhs2). The equation is derived by generating a series of
theorems that have ℋ(ENQUEUE(Insert(c, i), j)) as their left hand side. The generation is
begun by invoking synthesis rule (1) on the left hand side expression. The rest of the
theorems in the series are generated by invoking synthesis rule (2) using the rewrite rules of
PW for expansion. The rewrite rules for expansion are chosen with the following ultimate
goal: Obtain a right hand side that has the form ℋ(?rhs2) so that
ℋ(ENQUEUE(Insert(c, i), j)) ≻ ℋ(?rhs2), and ?rhs2 contains only the implementing
functions and the permitted operations of the implementing types. In the illustration given
below, the generation of every theorem in the series is considered as a step. At each step, the
expression expanded, and the rewrite rule used for expansion are indicated. The relevant
rewrite rules of PW that are going to be used for expansion are listed at the beginning.
Rule (1) is the ℋ-rule corresponding to ENQUEUE; rules (2) through (5) are obtained from the
association specification.
Relevant Rewrite Rules of the Perturbed World

(1) ℋ(ENQUEUE(c, j)) -> Enqueue(ℋ(c), ℋ(j))

(2) ℋ(Create) -> Nullq

(3) ℋ(Insert(c, i)) -> Add_at_head(ℋ(c), ℋ(i))

(4) Add_at_head(Nullq, i) -> Enqueue(Nullq, i)

(5) Add_at_head(Enqueue(q, i), j) -> Enqueue(Add_at_head(q, j), i)

Form of the theorem to be generated: ℋ(ENQUEUE(Insert(c, i), j)) = ℋ(?rhs2)
Normal form of ℋ(ENQUEUE(Insert(c, i), j)): Enqueue(Add_at_head(ℋ(c), ℋ(i)), ℋ(j))
Rules used for the normal form: (1), (3)

Step (1) Invoke Synthesis Rule (1) on ℋ(ENQUEUE(Insert(c, i), j))

ℋ(ENQUEUE(Insert(c, i), j)) = Enqueue(Add_at_head(ℋ(c), ℋ(i)), ℋ(j))

Step (2) Expand Expression: Enqueue(Add_at_head(ℋ(c), ℋ(i)), ℋ(j))
Using Rule: (5)

ℋ(ENQUEUE(Insert(c, i), j)) = Add_at_head(Enqueue(ℋ(c), ℋ(j)), ℋ(i))

Step (3) Expand Expression: Enqueue(ℋ(c), ℋ(j))
Using Rule: (1)

ℋ(ENQUEUE(Insert(c, i), j)) = Add_at_head(ℋ(ENQUEUE(c, j)), ℋ(i))

Step (4) Expand Expression: Add_at_head(ℋ(ENQUEUE(c, j)), ℋ(i))
Using Rule: (3)

ℋ(ENQUEUE(Insert(c, i), j)) = ℋ(Insert(ENQUEUE(c, j), i))

The theorem generated in step (4) qualifies to be a synthesis equation. Hence the desired rule of the
preliminary implementation is:

ENQUEUE(Insert(c, i), j) -> Insert(ENQUEUE(c, j), i)

One can similarly generate a theorem of the form ℋ(ENQUEUE(Create, j)) = ℋ(Insert(Create, j)),
which gives rise to the following rewrite rule to complete the preliminary implementation for
ENQUEUE:

ENQUEUE(Create, j) -> Insert(Create, j)



2.2.1.3.2 Derivation in the Inductive Theory 

The method used for deriving a synthesis equation in the inductive theory is based 
on the following property that every theorem of PW satisfies: If an equation is a theorem of 
PW, then every instance of it is in the equational theory of PW. An instance of an equation 
e1 = e2 is an equation obtained by replacing every variable in e1 and e2 by generator
constants.

We, therefore, take the following approach for deriving an equation in the inductive 
theory. First derive an instance of the desired equation; the method of derivation described 
earlier can be used for this purpose. The instance of the equation derived should be such that 
a generalization of it has the form of the desired synthesis equation, and is a theorem of PW. 
A generalization of e1 = e2 is an equation obtained by replacing assorted constants in e1 and
e2 by suitable variables. To check if the generalization is a theorem of PW, we use an
automatic procedure called is-an-inductive-theorem-of. The procedure is an extension of the 
method of using the Knuth-Bendix completion algorithm for proving inductive properties of 
convergent rewriting systems [28, 38, 22]. The procedure is described in chapter 4. 

As an illustration let us derive a synthesis equation of the form 
ℋ(APPEND(c, Insert(d, i))) = ℋ(?rhs2), which gives rise to one of the rules in the preliminary
implementation of APPEND. We begin by deriving an instance determined by the replacement
of the variable d by the constant Create, and then apply generalization. 

7. A generator constant is an expression formed out of generators, and does not contain any variables.

Relevant Rewrite Rules of the Perturbed World

(10) Append(q, Nullq) -> q
(14) ℋ(Create) -> Nullq
(20) ℋ(ENQUEUE(c, i)) -> Enqueue(ℋ(c), ℋ(i))
(22) ℋ(APPEND(c, d)) -> Append(ℋ(c), ℋ(d))

Form of the theorem to be generated: ℋ(APPEND(c, Insert(Create, i))) = ℋ(?rhs2)
Normal form of ℋ(APPEND(c, Insert(Create, i))): Enqueue(ℋ(c), ℋ(i))
Rules used for the normal form:

Step (1) Invoke Synthesis Rule (1) on ℋ(APPEND(c, Insert(Create, i)))

ℋ(APPEND(c, Insert(Create, i))) = Enqueue(ℋ(c), ℋ(i))

Step (2) Expand Expression: Enqueue(ℋ(c), ℋ(i))
Using Rule: (10)

ℋ(APPEND(c, Insert(Create, i))) = Append(Enqueue(ℋ(c), ℋ(i)), Nullq)

Step (3) Expand Expression: Nullq
Using Rule: (14)

ℋ(APPEND(c, Insert(Create, i))) = Append(Enqueue(ℋ(c), ℋ(i)), ℋ(Create))

Step (4) Expand Expression: Enqueue(ℋ(c), ℋ(i))
Using Rule: (20)

ℋ(APPEND(c, Insert(Create, i))) = Append(ℋ(ENQUEUE(c, i)), ℋ(Create))

Step (5) Expand Expression: Append(ℋ(ENQUEUE(c, i)), ℋ(Create))
Using Rule: (22)

ℋ(APPEND(c, Insert(Create, i))) = ℋ(APPEND(ENQUEUE(c, i), Create))

Step (6) Generalize the theorem in step (5) by replacing the constant
Create by the variable d to obtain the following equation:
ℋ(APPEND(c, Insert(d, i))) = ℋ(APPEND(ENQUEUE(c, i), d))

Apply is-an-inductive-theorem-of on the above equation.
This yields True, confirming that the equation is a theorem.
Hence the desired rule (obtained by dropping ℋ on both sides) is:

APPEND(c, Insert(d, i)) -> APPEND(ENQUEUE(c, i), d)

One can similarly generate a theorem of the form ℋ(APPEND(Create, d)) = ℋ(d), which gives rise to
the following rewrite rule to complete the preliminary implementation of APPEND.

APPEND(Create, d) -> d
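
The generalization step relies on every ground instance of the generalized equation being provable. A bounded check of that property can be sketched as follows. This is only a test over small instances and is not the is-an-inductive-theorem-of procedure. It assumes a list model of Queue_Int in which the front of the queue is at index 0, Enqueue adds at the rear, and Append is concatenation (consistent with rule (10), but an assumption beyond it). All function names are hypothetical.

```python
from itertools import product

CREATE = ("Create",)

def insert(c, i):
    return ("Insert", c, i)

def H(c):
    """Queue denoted by a circular-list generator term (front of queue at index 0)."""
    return [] if c == CREATE else [c[2]] + H(c[1])

def enqueue_q(q, e):      # Enqueue of Queue_Int in the assumed list model
    return q + [e]

def append_q(q, r):       # Append of Queue_Int, assumed to be concatenation
    return q + r

def lhs(c, d, i):         # H(APPEND(c, Insert(d, i))) reduced with rule (22)
    return append_q(H(c), H(insert(d, i)))

def rhs(c, d, i):         # H(APPEND(ENQUEUE(c, i), d)) reduced with rules (22) and (20)
    return append_q(enqueue_q(H(c), i), H(d))

def generator_terms(max_inserts, elems):
    """All generator terms built from Create with at most max_inserts Inserts."""
    level, out = [CREATE], [CREATE]
    for _ in range(max_inserts):
        level = [insert(c, e) for c in level for e in elems]
        out += level
    return out

TERMS = generator_terms(3, (0, 1))

# The instance derived in steps (1) to (5), with d replaced by Create.
instance_holds = all(lhs(c, CREATE, i) == rhs(c, CREATE, i)
                     for c, i in product(TERMS, (0, 1)))

# The generalized equation of step (6), tested on all small ground instances.
generalization_holds = all(lhs(c, d, i) == rhs(c, d, i)
                           for c, d, i in product(TERMS, TERMS, (0, 1)))
```

A failed check would show that the generalization is not a theorem; a passed check is merely evidence, which is why the procedure relies on an inductive proof method instead.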



2.2.2 Stage 2: Derivation of the Target Implementation

In the second stage of the synthesis procedure, the preliminary implementation is 
transformed into a target implementation. It should be noted that the preliminary 
implementation is itself an executable implementation. It can be executed by an interpreter 
that is capable of simplifying algebraic expressions using the equations in the specifications of 
data types as rewrite rules. The data type verification system AFFIRM [39] provides such an 
interpreter. Given the specifications of all the implementing types, the interpreter can 
execute the preliminary implementation on any given input. Our goal is to derive the target
implementation in a form that can be compiled by a compiler for an applicative language.
There are two reasons why a target implementation is more efficient than a preliminary 
implementation. The first one arises because of the freedom to use nongenerators of the 
representation type in a target implementation. This makes it possible, in some instances, to 
eliminate recursion from a preliminary implementation of an operation, and to transform it into
one which is a composition of the operations of the implementing types. The second reason
is that an implementation that can be compiled by means of a conventional compiler is in 
general more efficient than interpreting a set of rewrite rules. We investigate two methods of 
transforming a preliminary implementation into a target implementation. We describe each 
of them briefly below. The first method, although less efficient than the second, derives a 
larger set of implementations. 
2.2.2.1 Recursion Eliminating Method

According to this method the problem of deriving a target implementation is viewed 
as finding a composition f* of the operations of the implementing types and the 
implementing functions (possibly including the if_then_else function) that has the same 
functional behavior as the implementing function F defined by the preliminary 
implementation. For example, the composition Rotate(Insert(d, k)) has the same behavior as 
the function ENQUEUE defined by the rewrite rules of the following preliminary 
implementation: 

ENQUEUE(Create, j) -> Insert(Create, j)
ENQUEUE(Insert(c, i), j) -> Insert(ENQUEUE(c, j), i)

So, the following can be a target implementation for it: 
ENQUEUE(d, k) ::= Rotate(Insert(d, k)). Note that the target implementation does not use 
recursion. 
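
The claim that the composition behaves like the recursive definition can be checked on small inputs. The sketch below is an illustration under assumptions: circular lists are encoded as generator terms (nested tuples), and rotate follows the three Circ_List rewrite rules for Rotate listed in this section. It compares the two definitions on all lists with at most four elements drawn from {0, 1}; it is a bounded test, not a proof. The function names are hypothetical.

```python
CREATE = ("Create",)

def insert(c, i):
    return ("Insert", c, i)

def rotate(c):
    """Rotate on generator terms, following the three Circ_List rules for Rotate."""
    if c == CREATE:                       # Rotate(Create) -> Create
        return CREATE
    _, rest, i2 = c
    if rest == CREATE:                    # Rotate(Insert(Create, i)) -> Insert(Create, i)
        return c
    _, inner, i1 = rest                   # Rotate(Insert(Insert(c, i1), i2))
    return insert(rotate(insert(inner, i2)), i1)

def enqueue_prelim(c, j):
    """Preliminary implementation: the two rewrite rules for ENQUEUE."""
    if c == CREATE:
        return insert(CREATE, j)
    _, rest, i = c
    return insert(enqueue_prelim(rest, j), i)

def enqueue_target(d, k):
    """Target implementation: ENQUEUE(d, k) ::= Rotate(Insert(d, k))."""
    return rotate(insert(d, k))

def generator_terms(max_inserts, elems=(0, 1)):
    level, out = [CREATE], [CREATE]
    for _ in range(max_inserts):
        level = [insert(c, e) for c in level for e in elems]
        out += level
    return out

agree = all(enqueue_prelim(d, k) == enqueue_target(d, k)
            for d in generator_terms(4) for k in (0, 1))
```

Note that rotate is itself recursive in this encoding; the target implementation is free of recursion only in the sense that ENQUEUE is expressed as a composition of operations supplied by the implementing type.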

More formally, the problem can be stated as follows: Find a composition f* so that
the equations obtained by substituting f* for ENQUEUE in the rewrite rules are theorems of
the implementing types. The equations for ENQUEUE are given below. Note that, in
obtaining the following equations, the two sides of the rewrite rules are interchanged after
replacing ENQUEUE by f*. The need for the interchange will be explained later.

(1) Insert(Create, j) = f*(Create, j)

(2) Insert(f*(c, j), i) = f*(Insert(c, i), j)

We use the following strategy to find a solution for f*. We generate a theorem of 
the implementing types using one of the above equations as a template. For generating such 
a theorem we use the synthesis rules mentioned earlier. However this time, since we are 
interested in the theorems of the implementing types, the rewrite rules in the specification of 
the implementing types are used for expansion. The theorem generated determines a 
candidate for f*. The goal is to generate a theorem so that the candidate for f* determined by 
the theorem also satisfies the other equation. For instance, the sequence of steps given below 
generates a theorem that has the form of equation (1). 
Rewrite Rules of Circ_List

(3) Rotate(Create) -> Create

(4) Rotate(Insert(Create, i)) -> Insert(Create, i)

(5) Rotate(Insert(Insert(c, i1), i2)) -> Insert(Rotate(Insert(c, i2)), i1)

Form of the theorem to be generated: Insert(Create, j) = f*(Create, j)
Normal form of Insert(Create, j): Insert(Create, j)
Rules used for the normal form: None

Step (1) Invoke Synthesis Rule (1) on Insert(Create, j)

Insert(Create, j) = Insert(Create, j)

Step (2) Expand Expression: Insert(Create, j)
Using Rule: (4)

Insert(Create, j) = Rotate(Insert(Create, j))

The last theorem generated in the above series suggests that Rotate(Insert(d, k)) is a
candidate for f*(d, k). The candidate composition can be determined mechanically by
comparing the theorem generated with the template equation. The candidate we currently
have is such that the equation Rotate(Insert(Insert(c, i), j)) = Insert(Rotate(Insert(c, j)), i),
which is obtained by replacing f* by Rotate ∘ Insert in equation (2), is a theorem of Circ_List.
Had the candidate obtained in the last step not satisfied equation (2), the theorem generation
would have continued further to generate another theorem that had the form of equation (1).

The reason that the first equation, rather than the second, was used as the template
equation is the following. The synthesis rules are formulated so that the unknown expression
in the equation to be searched for is on the right hand side. In equation (2) both sides are
unknown since f* occurs on both the sides. That is not the case with equation (1). This was
also the reason for interchanging the two sides of the rewrite rules while obtaining the
template equations. In the example illustrated the theorem desired was in the equational
theory. In general, we need to use the generalization technique described earlier since the
theorem may be in the inductive theory.

2.2.2.2 The Recursion Preserving Method 

In this method the target implementation is derived with the help of a special set of
functions, called the inverting functions, on the representation type. To understand what
inverting functions are, and why they are useful, let us consider an example. The
preliminary implementation of SIZE consists of the following rules:

SIZE(Create) -> 0
SIZE(Insert(c, i)) -> SIZE(c) + 1

A target implementation for SIZE may take the following form:

SIZE(d) ::= if Empty(d) then 0
            else SIZE(Remove(d)) + 1

Note that in the preliminary implementation the argument to SIZE on the left hand 
side of a rule is permitted to be a generator expression. The argument indicates the pattern or 
the structure of the expression that constructs the values for which the rewrite rule is 
applicable. This freedom is used in a preliminary implementation to perform a case analysis
based on the structure of the argument, and to decompose the argument.

In a target implementation the argument to SIZE on the left hand side of the 
definition must be a variable. This means that the expression on the right hand side of the
definition must have explicit subexpressions for determining the structure of the argument,
and for decomposing the argument. Inverting functions of a data type can be used to build these
subexpressions.

Informally speaking, the inverting functions of a data type are functions that can be 



8. Inverting functions are closely related to distinguished functions of a data type defined in [24]. In
[24] the distinguished functions are used to formalize the expressive power of a data type.

9. If we are interested in interpreting the preliminary implementation, it is, therefore, necessary for 
the interpreter to have pattern matching capability to invoke the appropriate rewrite rule while 
simplifying an expression. 
used to algorithmically invert the process of constructing a value of the type from the
generators of the type. In other words, by applying one or more of the inverting functions a 
finite number of times on a value one can determine a generator expression that constructs 
the value. For instance, for Circ_List the operations Remove, Value, and Empty can serve as a
set of inverting functions. The structure of any circular list value in terms of Create and 
Insert can be determined using these operations. For instance, if v is a variable denoting the 
value constructed by Insert(c, j), then Remove(v) extracts the component c; ~Empty(v) checks
if v is constructed by an expression of the form Insert(c, j). So, the rewrite rules can be
merged into the following conditional expression:

if Empty(d) then 0 else SIZE(Remove(d)) + 1.

The target implementation is derived in two steps. The first step identifies a set of 
inverting functions for the representation type. In the second step the rewrite rules 
constituting the preliminary implementation of every operation are transformed into a target 
implementation in terms of the inverting functions. The method is described in detail in 
chapter 6. 
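
The difference between the two forms of SIZE can be seen in a short sketch. It is illustrative only: circular lists are encoded as generator terms (nested tuples), the behavior Remove(Insert(c, i)) = c is taken from the informal description above, and raising an exception for Remove(Create) is an assumption made here. The function names are hypothetical.

```python
CREATE = ("Create",)

def insert(c, i):
    return ("Insert", c, i)

# Inverting functions on the representation type (behavior assumed as described above).
def empty(v):
    return v == CREATE

def remove(v):
    if empty(v):
        raise ValueError("Remove(Create) is treated as exceptional in this sketch")
    return v[1]                       # Remove(Insert(c, i)) = c

def size_prelim(c):
    """Preliminary implementation: case analysis on the structure of the argument."""
    if c == CREATE:                   # SIZE(Create) -> 0
        return 0
    _, rest, _ = c                    # SIZE(Insert(c, i)) -> SIZE(c) + 1
    return size_prelim(rest) + 1

def size_target(d):
    """Target implementation: the argument is a variable, inspected via inverting functions."""
    return 0 if empty(d) else size_target(remove(d)) + 1

d = insert(insert(insert(CREATE, 7), 8), 9)
sizes = (size_prelim(d), size_target(d))
```

Here size_prelim inspects the generator structure of its argument directly, which corresponds to pattern matching in the rewrite rules, whereas size_target touches the argument only through empty and remove.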

2.2.3 Extending the Synthesis Procedure

Consider the association specification given in Fig. 6. It specifies a representation
scheme for implementing Queue_Int as a triple Array_Int × Integer × Integer, which can
informally be described as follows. (Array_Int is specified in the next chapter, which also
describes the association specification shown below in more detail.) Nullq can be represented
by any triple in which the two integer components are equal. A nonempty queue can be
represented by a triple <v, i, j>, where v is an array of arbitrary length containing the elements
of the queue between the index values i and j-1, in order. In other words, i points to the front
end of the queue, and j points to the next position available in the queue for adding an
element. Note that in this example, unlike the last one, not every value of the representation
type can legally represent a queue. A triple <v, i, j> is a legal representation value if and only if
i ≤ j, and v is guaranteed to be defined on all index values between i and j-1. The invariant ℐ
in Fig. 6 specifies this condition.

Fig. 6. Queue_Int in terms of Triple

𝒜(<v, i, i>) = Nullq
𝒜(<Assign(v, e, j), i, j+1>) = if i = j+1 then Nullq
                               else Enqueue(𝒜(<v, i, j>), e)

ℐ(<v, i, i>) = True
ℐ(<Assign(v, e, j), i, j+1>) = if i = j+1 then True
                               else if i < j+1 then ℐ(<v, i, j>)
                               else False

The synthesis in the presence of a nontrivial invariant ℐ has to be performed differently
because the implementation must be such that every implementing function F defined
preserves ℐ. That is, (∀v)[ℐ(v) ⇒ ℐ(F(v))].
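
The role of the invariant can be illustrated with a small model of the triple representation. Everything in this sketch is an assumption made for illustration: the array is modelled as a Python dictionary from indices to elements rather than as Assign terms, the queue is modelled as a list, and the bodies of ENQUEUE and BAD_ENQUEUE are hypothetical, not implementations derived by the procedure.

```python
def invariant(t):
    """Legality of a triple <v, i, j>: i <= j and v is defined on i .. j-1."""
    v, i, j = t
    return i <= j and all(k in v for k in range(i, j))

def abstraction(t):
    """Queue represented by a legal triple, as a list with the front at index 0."""
    v, i, j = t
    return [v[k] for k in range(i, j)]

def ENQUEUE(t, e):
    """Hypothetical implementing function: store e at position j and advance j."""
    v, i, j = t
    return ({**v, j: e}, i, j + 1)

def BAD_ENQUEUE(t, e):
    """A faulty variant that skips position j, leaving the array undefined there."""
    v, i, j = t
    return ({**v, j + 1: e}, i, j + 2)

empty = ({}, 3, 3)                       # any triple with equal indices represents Nullq
q = ENQUEUE(ENQUEUE(empty, 10), 20)
preserved = invariant(q)                 # ENQUEUE maps legal triples to legal triples here
violated = not invariant(BAD_ENQUEUE(empty, 10))
```

In this model ENQUEUE also satisfies the homomorphism property, since abstraction(ENQUEUE(t, e)) equals abstraction(t) followed by e for every legal t, while BAD_ENQUEUE yields a triple that does not represent any queue.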

The synthesis procedure for such a situation is similar to the one described earlier 
except for the method employed in determining the right hand sides of the rules of a 
preliminary implementation. The difference lies in the set of rewrite rules used for expansion 
while generating the theorems. Earlier, the rewrite rules of PW were used, but now it is 
necessary to use an additional set of rewrite rules. The additional rewrite rules describe 
information pertaining to the invariant ℐ, and the assumption that the arguments to the
implementing function satisfy the invariant. The information pertaining to ℐ is maintained as
a separate entity called the Temporary World. Chapter 5 describes how the Temporary World
is constructed, maintained, and used in the synthesis of an implementation.

2.3 The Scope of the Synthesis Procedure 

The scope of the synthesis procedure is limited because of two reasons. Firstly, the 
restrictions imposed on the input specifications limit the range of data type specifications that 
are acceptable as inputs to the procedure. Secondly, the synthesis procedure is capable of 
deriving only a class of implementations that satisfy certain properties. We describe the two 
forms of limitations below. 
2.3.1 Restrictions on the Inputs

The input specifications must be such that the Initial World (IW), which is a
combination of all the specifications, forms a rewriting system that

(1) has the uniform termination property, 

(2) has the unique termination property, and 

(3) is well-spanned. 

The second and the third properties are not restrictive because they can be attained 
by adding certain additional rewrite rules to the system. There are automatic procedures [28, 
38, 22] for determining the rules that need to be added, provided the system satisfies the 
uniform termination property. 

The uniform termination property can be restrictive. It is, in general, not possible to 
express all the properties one wishes to specify in a manner that preserves the uniform 
termination property. For example, consider the data type Set_of_Elements that has an
operation Insert to insert an element into a set. To express the property that the order of
insertion of elements into a set is immaterial, it is necessary to have a rewrite rule of the form
Insert(Insert(s, i), j) -> Insert(Insert(s, j), i) as a part of IW. A system containing this kind of
rule need not, in general, terminate because the rule does not strictly reduce an expression.
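
The loss of termination is easy to exhibit. In the sketch below, sets are encoded as nested tuples and the constant Empty_set is a hypothetical name for the empty set; the function applies the commutativity rule once at the root of an expression.

```python
def swap_step(t):
    """Apply Insert(Insert(s, i), j) -> Insert(Insert(s, j), i) at the root, if it matches."""
    if t[0] == "Insert" and t[1][0] == "Insert":
        _, (_, s, i), j = t
        return ("Insert", ("Insert", s, j), i)
    return None                                   # the rule does not apply at the root

t0 = ("Insert", ("Insert", ("Empty_set",), 1), 2)
t1 = swap_step(t0)                                # Insert(Insert(Empty_set, 2), 1)
t2 = swap_step(t1)                                # back to the starting expression
```

Two applications of the rule return the original expression, so a rewriting system containing it admits an infinite reduction sequence and no termination ordering can orient the rule.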

One way of getting around this problem is to exclude the concerned rule(s) from 
IW. However, there are two reasons why one may not want to do this. Firstly, the rule might 
be needed to attain the second and the third properties mentioned above. In such a situation 
excluding the rule(s) makes the input unacceptable. The second reason is that omitting the 
rule may leave the specification incomplete. 10 The method used by the synthesis procedure 
does not require the specifications to be complete, so the input (excluding the concerned rule) 
in this case is acceptable. But the procedure will not be able to derive an implementation that 
is dependent on the property expressed by the rule. 



10. We use the following notion of completeness: A specification is complete if all the properties that 
are valid for the data type are provable from the specification. 
2.3.2 The Class of Implementations Derived

There are three factors that are responsible for limiting the class of implementations 
derived by the procedure. The first is related to the subset of the proof theory of the input 
specifications in which the synthesis procedure operates. The procedure can only derive 
those implementations whose correctness proof is within the operational part of the theory. 
The operational part of the theory comprises the subset of the inductive theory that is decided 
by the Musser/Knuth-Bendix method [38] of proving inductive properties. 

The second limiting factor is the termination ordering ≻. The synthesis procedure
assumes that an effective ordering is implicitly available to be used in ensuring the
termination of the implementation. So, the procedure can only derive those implementations
whose termination can be proved using the ordering ≻. The more general 11 the ordering ≻,
the larger is the class of implementations that can be derived.

The third reason is that the implementations derived may not involve arbitrary 
helping functions. The synthesis procedure is not capable of automatically discovering a 
helping function that might be necessary in an implementation. The user has to furnish a 
specification of the helping function as a part of the Initial World if he wishes an 
implementation in terms of the helping function. 

2.3.3 Effects of Using the Procedure Outside its Scope 

Using the procedure on a specification that does not satisfy the uniform termination 
property may result in infinite looping. This is because, under such a circumstance, there can 
be expressions for which a normal form does not exist. The effect of a violation of the unique
termination property depends on how serious the violation is. If the violation implies that the 
system is inconsistent, then the procedure may derive an incorrect implementation. However, 
if the system is consistent despite the violation, the effect will only be a reduction in the class 
of implementations that the procedure can derive. It should be noted that all three of the
properties required of the inputs can be checked automatically (assuming that a termination
ordering ≻ is available).

11. An ordering ≻1 is considered to be more general [23] than ≻2 if ≻1 contains ≻2. That is, ≻1
relates a larger set of expressions than ≻2.
3. Inputs to the Synthesis Procedure

This chapter has four sections. The first section defines data types and their 
specification. The second section describes the association specification. The third section 
characterizes the restrictions on the inputs. The last section describes proving properties of 
data types from the specifications. 

3.1 Data Types and their Specification 

3.1.1 Preliminary Concepts 

A data type consists of a set (perhaps infinite) of values, called the value set, and a 
finite set of operations, called the operation set. The only way in which the values of a data 
type can be constructed, manipulated or observed is through the operations of the data type. 

The behavior of a data type is usually dependent on several other data types. These 
data types appear as a part of the domain or as the range of the operations of the data type 
under consideration. We call these other data types the defining types; the data type under 
consideration is referred to as the type of interest (TOI). If the TOI is the one that is being 
implemented, we refer to it as the implemented type. The type that is used to represent the 
implemented type is called the representation type. The defining types of the representation 
type are called the ancillary types. The union of the representation type and the ancillary 
types is called the set of implementing types. For example, the defining types of the data type 
Queue_Int specified in Fig. 7 are Integer and Bool.

A data type has two kinds of operations. A constructor is an operation that yields a 
value of the TOI, and an observer is an operation that yields a value of a defining type. For 
Queue_Int, the operations Nullq, Enqueue, Dequeue, and Append are all constructors; the rest
of the operations are observers. 

We treat the exceptional behavior of a data type in a simplified fashion. We assume 
that every data type has a unique exceptional value that is constructed by the operation Error 
belonging to the type. The value Error( ) is treated like any other value of the type except
that it has the following unique property. Every operation is assumed to be strict with respect
to Error( ): Every operation f is such that when applied to Error( ) from any of its domain
types it yields the exceptional value of the range type of f. We assume that every operation f 
is a total function: That is, f is defined on every element of its domain yielding either an 
exceptional value or a normal value from its range type. 

The requirement on a data type that its values be manipulated only by its operations
translates to requiring that its values be constructed only by its constructors, possibly using
the values of its defining types. Furthermore, in a computer the values can be constructed
only by a finite sequence of operations, so the value set of a data type is the smallest set closed
under finitely many applications of its constructors. This property of a data type is called the
minimality property [25].

A subset of constructors is said to be complete if every value of the TOI can be 
constructed by some composition of the constructors in the subset (possibly using values of 
the defining types). A basis for a data type is a complete set of constructors that is minimal, 
i.e., no proper subset of a basis is complete. A data type may have more than one basis. {Nullq,
Enqueue} is a basis for Queue_Int since all queues can be generated using Nullq and
Enqueue, and no proper subset of it can do so.

An expression (or a term) is a sequence of operations and variables denoting an 
application of the operations to the variables. The type of an expression is the range type of 
the operation symbol that appears at the outermost level of the expression. A constant is an 
expression that does not contain any variables. For example, Dequeue(Enqueue(q, e)) is an 
expression of type Queue_Int; it is not a constant since it contains variables.
Dequeue(Enqueue(Nullq, 0)) is a constant of type Queue_Int.

3.1.2 Definition of a Data Type 

The only way in which the values of a data type can be manipulated is through the 
operations of the type. We define a data type so as to capture the behavior of the type as 
viewed through the operations of the type. This behavior is called the observable behavior of 
the data type. This method of definition was advocated by Guttag [16], and later developed 
by Kapur [25]. According to this view, the values of a data type are distinguishable only by 
means of the operations of the type.

Heterogeneous algebras provide a natural means of modeling the behavior of a data 
type. A heterogeneous algebra that can be used to model a data type is defined recursively in 
terms of the algebra that is used to model each of its defining types. The basis of this 
recursion is the type Bool which does not have any defining types. 

A heterogeneous algebra for a data type D consists of (i) a domain corresponding to
D, which is called the principal domain, (ii) a domain corresponding to every defining type of
D, and (iii) a function corresponding to every operation of D. The elements of the principal
domain are used to denote the values of D. The minimality property of a data type requires
that every element of the domains of the algebra be constructible by a finite number of
applications of the constructors of the appropriate type. Any heterogeneous algebra that has
the appropriate signature, and that exhibits the desired observable behavior can be used to
model the data type. Hence, we define a data type as a set of heterogeneous algebras that
exhibit the same observable behavior. Every algebra in the set is said to be a model of the
data type. The elements of the principal domain are called the values (of D) in that model.
Below we formally characterize the observable behavior of a heterogeneous algebra.

The observable behavior of a model is characterized in terms of the 
distinguishability relation on the values of the model. The distinguishability relation is 
defined inductively in terms of the distinguishability of the values of the defining types. That 
is, we assume that the distinguishability relation is already defined on the domain corresponding 
to each of the defining types. (The basis of this induction is the data type Bool that does not 
have any defining types; the only two values, True and False, of Bool are assumed to be 
distinguishable.) Two values of a model are distinguishable if and only if there is a sequence 
of operations of D, with an observer as the outermost operation, that produces distinguishable 
results when applied separately on the values. If two values are not distinguishable, they are 
observably equivalent. For instance, the Queue_Int values constructed by Enqueue(Nullq, 0) 
and Append(Nullq, Enqueue(Nullq, 0)) are observably equivalent; but the ones constructed by 
Enqueue(Nullq, 0) and Dequeue(Enqueue(Nullq, 0)) are distinguishable. Observable 
equivalence is an equivalence relation. 

Definition Two models are behaviorally equivalent if their quotient models induced by the 
observable equivalence relations are isomorphic to each other. 

Definition A data type is a set of behaviorally equivalent heterogeneous algebras. 

3.1.3 Specification of a Data Type 

The specification of a data type is a piece of text in a formal language. It describes a 
set of properties concerning the operations of the data type. The aim of writing a 
specification is to characterize through the specification the observable equivalence relation 
that defines the data type. 

It has been observed [17] that the construction of an algebraic specification for a 
data type is rendered easier and more reliable (in the sense that one has increased confidence 
in the consistency and completeness of the specification) by using a basis of the data type as a 
guide for constructing the specification. We assume that all our specifications are constructed 
in this fashion. The operations belonging to the basis of a specification are called the 
generators of the specification. An operation that is not in the basis is called a non-generator. 
Note that all generators are constructors; non-generators may be constructors or observers. 

Throughout the development when we refer to the basis or the generators of a data 
type involved in the synthesis, we actually mean the basis or the generators associated with 
the specification of the data type being used as an input to the synthesis procedure. 
Definitions of a couple of new terms pertaining to the generators are in order at this point. A 
generator expression (generator constant) of a data type is an expression (constant) that 
consists of only the generators of the type. Taking Queue_Int with the specification given in 
Fig. 7 as an example: Enqueue(Nullq, 0) is a generator constant, whereas 
Dequeue(Enqueue(Nullq, 0)) is not a generator constant, because Dequeue is a non-generator. 

3.1.3.1 The Specification Language 

The specification language we use is a restricted version of an equational language 
that permits conditionals and auxiliary functions. The language is similar to the ones used in 
several other works on data type specification and verification such as [14, 18, 25]. A 
specification has two parts: the Operations part describes the functionality of every operation 
of the TOI; we assume that the Operations part identifies the basis used for the specification. 
The Axioms part consists of a set of axioms describing the properties of the operations. Every 
axiom has the form of an equation e_1 = e_2, where e_1 and e_2 are expressions of the same type. 
The expressions may involve any of the operations of the TOI and the defining types. The 
expressions may contain any of a finite number of auxiliary functions which are also specified 
as part of the specification. The equations may involve conditional expressions on their right 
hand side, i.e., e_2 may contain the auxiliary function if_then_else which behaves like a 
conditional expression (see footnote 12). For the sake of clarity, we use the following more 
conventional syntax for an expression involving if_then_else. The expression 
if_then_else(b, e_21, e_22) is written as if b then e_21 else e_22. 

We differ from the works cited above by assuming that every axiom in the 
specification satisfies the following syntactic constraints. The constraints are not restrictive, in 
the sense that they do not restrict the class of data types that can be specified. The first 
constraint enables us to automatically partition the axiom set into two disjoint sets: One that 
contains only the generator symbols; the other whose axioms may involve generators as well 
as nongenerators. The partitioning of the axiom set facilitates the synthesis process by 
reducing the inter-dependence of the synthesis of different operations. The second constraint 
permits the axioms to be treated as left to right rewrite rules (to be described later) without 
having to interchange the two sides of the axioms. 



12. if_then_else can be specified by the following two equations. 
if_then_else : Bool X T X T -> T 

if_then_else(True, e_1, e_2) = e_1 
if_then_else(False, e_1, e_2) = e_2 

Every axiom e_1 = e_2 of a specification satisfies the following conditions: 

(1) Every data type specification explicitly identifies a basis, i.e., a set of generators. 

(2) The set of variables in e_2 is a subset of the set of variables in e_1. 

Figures 7 and 8 show specifications of a (FIFO) queue of integers (Queue_Int) and a circular 
list of integers (Circ_List). The specifications meet the constraints specified above. 

3.1.3.2 Semantics of a Specification 

The specification of a data type characterizes the observable equivalence relation 
that defines the data type. The semantics of a specification is a set of heterogeneous algebras 
that are behaviorally equivalent based on the observable equivalence relation characterized 
by the specification. 

To determine the observable equivalence relation characterized by a specification, 
the symbol '=' in the axioms of the specification should be read as 'observably equivalent'. 
For instance, the equation Size(Enqueue(q, e)) = Size(q) + 1 in the specification of 
Queue_Int asserts that the two expressions yield observably equivalent values for all 
instantiations of the variables in them. The observable equivalence relation characterized by 
the specification is the reflexive, symmetric, transitive closure of =. Every algebra that 
satisfies all the axioms in the specification is a model of the type being specified by the 
specification. 

3.2 Association Specification 

In addition to the specifications of the types involved in the synthesis, the synthesis 
procedure expects the user to provide information about the representation scheme to be 
used by the implementation that is to be derived. This section explains what exactly that 
information is, and how it can be specified. We call the formal description of the information 
the association specification of an implementation. 

Fig. 7. Specification of Queue_Int 

Queue_Int is Nullq, Enqueue, Front, Dequeue, Append, Size 

Defining Types 
Bool, Int 

Operations 

Nullq : -> Queue_Int 
Enqueue : Queue_Int X Int -> Queue_Int 
Front : Queue_Int -> Int U { ERROR } 
Dequeue : Queue_Int -> Queue_Int U { ERROR } 
Append : Queue_Int X Queue_Int -> Queue_Int 
Size : Queue_Int -> Int 

Basis 

{ Nullq, Enqueue } 

Axioms 

(1) Front(Nullq) = ERROR 
(2) Front(Enqueue(Nullq, e)) = e 
(3) Front(Enqueue(Enqueue(q, e1), e2)) = Front(Enqueue(q, e1)) 
(4) Dequeue(Nullq) = ERROR 
(5) Dequeue(Enqueue(Nullq, e)) = Nullq 
(6) Dequeue(Enqueue(Enqueue(q, e1), e2)) = Enqueue(Dequeue(Enqueue(q, e1)), e2) 
(10) Append(q, Nullq) = q 
(11) Append(q1, Enqueue(q2, e2)) = Enqueue(Append(q1, q2), e2) 
(12) Size(Nullq) = 0 
(13) Size(Enqueue(q, e)) = Size(q) + 1 



Fig. 8. Specification of Circ_List 

Circ_List is Create, Insert, Value, Remove, Rotate, Empty, Join 

Defining Types 
Integer, Boolean 

Operations 

Create : -> Circ_List 
Insert : Circ_List X Integer -> Circ_List 
Value : Circ_List -> Integer U { ERROR } 
Remove : Circ_List -> Circ_List U { ERROR } 
Rotate : Circ_List -> Circ_List 
Empty : Circ_List -> Boolean 
Join : Circ_List X Circ_List -> Circ_List 

Comment 
Circ_List is a list of integers with a front end and a rear end. Create constructs an empty list; the front 
and the rear ends of an empty list are the same. Insert inserts an element into a list at the rear end. 
Value returns the element at the rear end of a list. Remove removes the element at the rear end from a 
list. Rotate moves every element in a list by one position towards the rear end in a cyclic fashion, i.e., 
the element at the rear is moved to the front. Empty checks if a list is empty. Join joins two lists by 
positioning the first argument in front of the second. 

Basis 
{ Create, Insert } 

Axioms 

(1) Value(Create) = ERROR 
(2) Value(Insert(c, i)) = i 
(3) Remove(Create) = ERROR 
(4) Remove(Insert(c, i)) = c 
(5) Rotate(Create) = Create 
(6) Rotate(Insert(Create, i)) = Insert(Create, i) 
(7) Rotate(Insert(Insert(c, i1), i2)) = Insert(Rotate(Insert(c, i2)), i1) 
(8) Empty(Create) = true 
(9) Empty(Insert(c, i)) = false 
(10) Join(c, Create) = c 
(11) Join(c, Insert(d, i)) = Insert(Join(c, d), i) 

3.2.1 What is an Association Specification? 

An association specification characterizes two pieces of information about a 
representation scheme: 

(1) The set of values of the representation type that an implementation may use in 
representing the values of the implemented type. We call this set the representing 
domain (ℛ). ℛ is characterized by means of a predicate on the representation type 
called the invariant (ℐ): ℛ is the set of values of the representation type for which ℐ 
is True. 

(2) A function, called the abstraction function (𝒜), from the values of the representation type 
to the values of the implemented type. The function corresponds to the 
representation function of a data type introduced by [21]. The abstraction function 
maps a representation value to an abstract value that the former may represent in an 
implementation. An abstraction function may be a many-to-one function. An 
abstraction function does not have to be defined on every value of the representation type. 
However, it has to be defined on every value in the representing domain. 

The information characterized by the association specification is often the most 
creative part of an implementation. The proof of correctness of an implementation also, in 
general, needs to use information such as this. If the invariant part of an association 
specification is vacuous, then we assume that the invariant is true on all values of the 
representation type. In such a case the representing domain includes all the values of the 
representation type. 

3.2.2 How Is It Expressed? 

We specify ℐ and 𝒜 using the same language that is used to specify the data types 
involved. ℐ is specified as a set of equations, like any other predicate on the value set of the 
representation type. 𝒜 is specified as a set of equations relating expressions of the 
representation type to expressions of the implemented type. We require that 𝒜 be specified 
as a well-defined function with a nonempty domain. 

Fig. 9. Two Association Specifications for Queue_Int 

9(a) Queue_Int in terms of Circ_List 

𝒜(Create) = Nullq 
𝒜(Insert(c, i)) = add_at_head(𝒜(c), i) 

add_at_head(Nullq, i) = Enqueue(Nullq, i) 
add_at_head(Enqueue(q, i), i1) = Enqueue(add_at_head(q, i1), i) 

9(b) Queue_Int in terms of Array_Int X Int X Int 

𝒜(<v, i, i>) = Nullq 
𝒜(<Assign(v, e, j), i, j+1>) = if i = j+1 then Nullq 
                               else Enqueue(𝒜(<v, i, j>), e) 

ℐ(<v, i, i>) = True 
ℐ(<Assign(v, e, j), i, j+1>) = if i = j+1 then True 
                               else if j+1 < i then False 
                               else ℐ(<v, i, j>) 



Fig. 9 gives two examples of association specifications. 9(a) specifies an 
implementation of Queue_Int in terms of Circ_List. The empty queue is represented by the 
empty list; a nonempty queue is represented by a list whose elements are identical to the ones 
in the queue, but are arranged in the reverse order. The motivation for this representation 
scheme is that reading and deletion of elements from a queue can be performed efficiently. 
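
As an illustration, the abstraction function of Fig. 9(a) can be transcribed almost literally into a small Python sketch. The encoding of expressions as nested tuples and the constant names NULLQ and CREATE are assumptions made for this sketch; they are not notation used by the synthesis procedure.

```python
# Terms are nested tuples (an assumed encoding):
#   ("Create",), ("Insert", c, i) for Circ_List
#   ("Nullq",),  ("Enqueue", q, e) for Queue_Int
NULLQ = ("Nullq",)
CREATE = ("Create",)


def add_at_head(q, i):
    """Auxiliary function of Fig. 9(a): add i at the front of queue term q."""
    if q == NULLQ:
        return ("Enqueue", NULLQ, i)
    _, rest, last = q                      # q = Enqueue(rest, last)
    return ("Enqueue", add_at_head(rest, i), last)


def abstraction(c):
    """Map a Circ_List generator term to the Queue_Int term it represents."""
    if c == CREATE:
        return NULLQ
    _, rest, i = c                         # c = Insert(rest, i)
    return add_at_head(abstraction(rest), i)


# A list built by inserting 1 and then 2 (2 is at the rear end).
circ = ("Insert", ("Insert", CREATE, 1), 2)
queue = abstraction(circ)
```

Here `queue` is the term Enqueue(Enqueue(Nullq, 2), 1): the elements appear in the reverse order, so the element at the rear of the list is the one at the front of the queue.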

Consider the association specification given in Fig. 6. It specifies a representation 
scheme for implementing Queue_Int as a triple, which can informally be described as 
follows. (Array_Int is specified in the next chapter which also describes the association 
specification shown below in more detail.) 

Fig. 10. Specification of Array_Int 

Array_Int is Nullarr, Assign, Read, Size, Empty 

Defining Types 
Integer, Boolean 

Operations 

Nullarr : -> Array_Int 
Assign : Array_Int X Integer X Integer -> Array_Int 
Read : Array_Int X Integer -> Integer U { ERROR } 
Size : Array_Int -> Integer 
Empty : Array_Int -> Boolean 

Comment 
Array_Int is an array of integers. Every element in the array is indexed by an integer; the indices are 
not necessarily contiguous. Nullarr creates an empty array. Assign assigns a given value (the second 
argument) to the element at a given index (the third argument); if the array does not have an element 
with the given index, then the value is added to the array. Read reads the element at the given index. 
Empty checks if an array is empty. 

Basis 
{ Nullarr, Assign } 

Axioms 

(1) Assign(Assign(v, e1, i1), e2, i2) = if i1 = i2 then Assign(v, e2, i2) 
                                        else Assign(Assign(v, e2, i2), e1, i1) 
(2) Read(Nullarr, i) = ERROR 
(3) Read(Assign(v, e, j), i) = if i = j then e 
                               else Read(v, i) 
(4) Empty(Nullarr) = true 
(5) Empty(Assign(v, e, i)) = false 

Fig. 9(b) specifies an implementation in which a queue is implemented as a triple 
Array_Int X Integer X Integer. (Array_Int is specified in Fig. 10.) The representation scheme 
can be informally described as follows. Nullq can be represented by any triple in which the 
two integer components are equal. A nonempty queue can be represented by a triple <v, i, j>, 
where v is an array of arbitrary length containing the elements of the queue between the 
index values i and j-1, in order. In other words, i points to the front end of the queue, and j 
points to the next position available in the queue for adding an element. 

Note that in this example, unlike the last one, not every value of the representation 
type can legally represent a queue. A triple <v, i, j> is a legal representation value if and only if 
i ≤ j, and v is guaranteed to be defined on all index values between i and j-1. The invariant ℐ 
in Fig. 9(b) specifies this condition. 
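
The informal reading of the array representation can be made concrete with a short Python sketch. It follows the prose description above rather than the equations of Fig. 9(b) term by term, and it rests on two modelling assumptions that are not part of the thesis: an Array_Int value is modelled as a dict from indices to integers, and a queue is modelled as a list with the front element first.

```python
def invariant(v, i, j):
    """True iff <v, i, j> is a legal representation value: i <= j and the
    array v (a dict, by assumption) is defined on every index i, ..., j-1."""
    return i <= j and all(k in v for k in range(i, j))


def abstraction(v, i, j):
    """Return the queue (front element first) represented by a legal triple."""
    if not invariant(v, i, j):
        raise ValueError("triple is outside the representing domain")
    return [v[k] for k in range(i, j)]


# The array may hold entries outside the window i..j-1; they are ignored.
array = {3: 10, 4: 20, 9: 99}
contents = abstraction(array, 3, 5)
```

With these definitions `contents` is `[10, 20]`, any triple with equal integer components abstracts to the empty queue, and a triple such as `({3: 10}, 3, 5)` is rejected because index 4 is undefined. The abstraction is many-to-one: different arrays and different windows can represent the same queue.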

The abstraction function 𝒜 is specified so that it is defined on all values for which ℐ 
is True. The specification uses an auxiliary function add_at_head. add_at_head is a function 
on Queue_Int that adds a given element at the front of a queue. A specification of 
add_at_head is given as a part of the association specification. 

3.2.3 Further Discussion on Association Specification 

It is important to note that not every association specification need have an 
implementation corresponding to it. To understand this more clearly, let us look at the 
relationship between an association specification and an implementation that uses a 
representation scheme consistent with the one characterized by the association specification. 

An implementation of a data type consists of 

(i) a representation type being used as the representation for the implementation. 

(ii) a program, i.e., a segment of code, for every operation of the type in a language; this 
program is called the implementation of the corresponding operation. 

Note that both a preliminary implementation and a target implementation (as introduced in 
the previous chapter) of a data type are implementations of the data type. A preliminary 
implementation uses one language to express the program, while the target implementation 
uses another. 

Formally, an implementation of a data type can be considered to be denoting a 
heterogeneous algebra, called an implementation algebra, with 

(i) a principal domain that is a subset of the value set of the representation type, 

(ii) a domain corresponding to every defining type of the implemented type; this 
domain is identical to the value set of the corresponding defining type, 

(iii) a function corresponding to the implementation of every operation of the 
implemented type so that the function mimics the behavior of the implementing 
program. 

An implementation of a type is correct if there exists a homomorphism from the 
implementation algebra to the implemented type. The association specification should be 
such that there exists an implementation algebra with computable functions that corresponds 
to the representation scheme characterized by the association specification. More specifically, 
the implementation algebra should satisfy the following conditions: 

(i) The principal domain of the algebra is the representing domain characterized by the 
association specification. 

(ii) There is a computable function in the algebra with the appropriate functionality 
corresponding to every operation of the implemented type. 

(iii) The implemented data type is a homomorphic image of the implementation algebra 
with respect to the abstraction function. 
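
Condition (iii) can be spot-checked on sample values. The Python sketch below does this for the representation scheme of Fig. 9(a), with several assumptions that are ours and not the thesis's: lists and queues are modelled as tuples, the three `*_impl` functions are hypothetical candidate implementations written by hand (they are not output of the synthesis procedure), and a finite set of samples is tested, which gives evidence of the homomorphism property but does not prove it.

```python
def abstraction(c):
    """Fig. 9(a) read on tuples: a Circ_List tuple has its rear end last,
    a Queue_Int tuple has its front first, so the order is reversed."""
    return tuple(reversed(c))


# Abstract Queue_Int operations on tuples (front of the queue at index 0).
def enqueue(q, e):
    return q + (e,)

def dequeue(q):
    return q[1:]

def front(q):
    return q[0]


# Hypothetical candidate implementations on Circ_List tuples.
def enqueue_impl(c, e):
    return (e,) + c          # new element goes to the front end of the list

def dequeue_impl(c):
    return c[:-1]            # plays the role of Remove

def front_impl(c):
    return c[-1]             # plays the role of Value


def commutes_on(samples, e=7):
    """Check abstraction(op_impl(c)) == op(abstraction(c)) on the samples."""
    for c in samples:
        if abstraction(enqueue_impl(c, e)) != enqueue(abstraction(c), e):
            return False
        if c:  # Front and Dequeue are compared on nonempty lists only
            if abstraction(dequeue_impl(c)) != dequeue(abstraction(c)):
                return False
            if front_impl(c) != front(abstraction(c)):
                return False
    return True


ok = commutes_on([(), (1,), (1, 2), (1, 2, 3)])
```

The error cases Front(Nullq) and Dequeue(Nullq) are deliberately left out of the comparison to keep the sketch short.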

We do not intend to formally characterize the properties that the association specification 
ought to satisfy so that it meets the above requirement. Rather, we trust the intuition of the 
user, and assume that there exists an implementation that is consistent with the association 
specification furnished by him. If the association specification provided as an input to the 
synthesis procedure is such that there is no implementation corresponding to it, then the 
synthesis procedure will, in general, never terminate. The synthesis method, however, does 
not produce an incorrect implementation in such a case. 

3.3 Restrictions on the Inputs 

The method used by the synthesis procedure to derive an implementation is based 
on treating every equation in the specifications as a rewrite rule. The procedure combines all 
the input specifications, and treats the union as a set of rewrite rules called the Initial World. 
The restrictions imposed on the inputs are intended to ensure that the Initial World satisfies a 
useful property called the principle of definition. 

The first subsection informally introduces the basic concepts about rewrite rules. 
(See Appendix I for formal definitions.) The second subsection defines principle of 
definition, and develops a sufficient set of conditions for principle of definition (SCPD). The 
input is expected to satisfy SCPD. The third subsection describes how to prove properties 
from a specification that satisfies SCPD. 

3.3.1 Rewrite Rules and Rewriting Systems 

A rewrite rule is an ordered pair (left, right), written left -> right, where left and 
right are expressions containing variables so that the variables in right are among the 
variables in left. A rule is used to reduce an expression by replacing any subexpression that is 
matched by left with a corresponding version of right, i.e., with the same substitutions for 
variables that were made in matching left. (More precise definitions are given in Appendix I.) 
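
The matching-and-replacement step just described can be sketched in a few lines of Python. The encoding is an assumption made for illustration: expressions are nested tuples whose first component is the operation name, and strings beginning with "?" stand for variables. The rule used is axiom (11) of Queue_Int read from left to right.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")


def match(pattern, term, subst=None):
    """Return a substitution that makes pattern identical to term, or None."""
    subst = dict(subst or {})
    if is_var(pattern):
        if pattern in subst and subst[pattern] != term:
            return None
        subst[pattern] = term
        return subst
    if not isinstance(pattern, tuple) or not isinstance(term, tuple):
        return subst if pattern == term else None
    if len(pattern) != len(term) or pattern[0] != term[0]:
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst


def substitute(t, subst):
    """Replace the variables of t according to subst."""
    if is_var(t):
        return subst.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, subst) for a in t[1:])
    return t


def rewrite_once(term, left, right):
    """Reduce the first subexpression matched by left; None if irreducible."""
    s = match(left, term)
    if s is not None:
        return substitute(right, s)
    if isinstance(term, tuple):
        for k in range(1, len(term)):
            new = rewrite_once(term[k], left, right)
            if new is not None:
                return term[:k] + (new,) + term[k + 1:]
    return None


left = ("Append", "?q1", ("Enqueue", "?q2", "?i2"))
right = ("Enqueue", ("Append", "?q1", "?q2"), "?i2")
alpha = ("Dequeue", ("Append", "?q3", ("Enqueue", ("Nullq",), 0)))
beta = rewrite_once(alpha, left, right)
```

Variables occurring in the expression being reduced (here "?q3") are treated as opaque symbols during matching; only the variables of the rule are bound.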

For example, consider the rule 

Append(q1, Enqueue(q2, i2)) -> Enqueue(Append(q1, q2), i2), and the expression 
α = Dequeue(Append(q3, Enqueue(Nullq, 0))). α is reducible using the rule because it has a 
subexpression α' = Append(q3, Enqueue(Nullq, 0)) that has the form of the left hand side of 
the rule: That is, Append(q1, Enqueue(q2, i2)) becomes identical to 
Append(q3, Enqueue(Nullq, 0)) when the variables in the former are substituted according to 
the substitution σ = [q1 ↦ q3, q2 ↦ Nullq, i2 ↦ 0]. The corresponding instance of the right 
hand side of the rule (obtained by substituting the variables in Enqueue(Append(q1, q2), i2) 
using the substitution σ) is β' = Enqueue(Append(q3, Nullq), 0). 

β = Dequeue(Enqueue(Append(q3, Nullq), 0)) is the expression obtained by replacing α' by 
β' in α. Then, we say that α reduces to β, written α -> β. 

A rewriting system is a set of rewrite rules. Let R be a rewriting system. An 
expression α is reducible by R if it is reducible by some rule in R. If α is not reducible by any 
rule in R, then α is irreducible by R. 

If α -> β by a rule in R, then we say that α directly reduces to β using R, and once 
again write it as α -> β (using R). Let ->* be the smallest relation on pairs of expressions 
which is the reflexive, transitive closure of ->. Thus, α ->* β if and only if there exist 
expressions α_0, α_1, ..., α_n, where n ≥ 0, such that α = α_0, α_i -> α_{i+1} for i = 0, ..., n-1, and 
α_n = β. We read α ->* β as α reduces to β. 

Suppose α ->* β, and β is irreducible. Then we say that α simplifies to β; β is called 
a normal form of α (in R). 

Rewriting systems are used to simplify expressions into their normal forms. Thus, a 
useful property of a system is uniform termination: R has the uniform termination property if 
no infinite sequence of reductions, α -> α_1 -> ..., is possible in R. When R has the uniform 
termination property every expression is guaranteed to have a normal form. Another useful 
property of a rewriting system is unique termination: R has the unique termination property if 
any two terminating sequences of reductions starting from the same expression have identical 
final expressions. When R has the unique termination property the normal form (if it exists) 
of every expression is unique. A rewriting system that has both the uniform termination 
property and the unique termination property is said to be convergent. When R is convergent 
every expression α has exactly one normal form; we denote the unique normal form of α in a 
convergent system by α*. 

The rewriting systems corresponding to our input specifications are obtained by 
simply replacing the symbol '=' by the symbol '->' in each of the equations in the 
specifications. For example, Fig. 11 gives the rewriting system corresponding to the 
specification of Queue_Int in Fig. 7. Henceforth, we treat the input specifications as rewriting 
systems obtained as explained above. When we refer to a specification, we actually mean the 
rewriting system obtained from the specification. 

Fig. 11. The Queue_Int Rewriting System 

(1) Front(Nullq) -> ERROR 
(2) Front(Enqueue(Nullq, e)) -> e 
(3) Front(Enqueue(Enqueue(q, e1), e2)) -> Front(Enqueue(q, e1)) 
(4) Dequeue(Nullq) -> ERROR 
(5) Dequeue(Enqueue(Nullq, e)) -> Nullq 
(6) Dequeue(Enqueue(Enqueue(q, e1), e2)) -> Enqueue(Dequeue(Enqueue(q, e1)), e2) 
(10) Append(q, Nullq) -> q 
(11) Append(q1, Enqueue(q2, e2)) -> Enqueue(Append(q1, q2), e2) 
(12) Size(Nullq) -> 0 
(13) Size(Enqueue(q, e)) -> Size(q) + 1 
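
The system of Fig. 11 is small enough to execute directly. The following Python sketch computes normal forms of Queue_Int constants. It is an illustration under stated assumptions: expressions are nested tuples, strings beginning with "?" are rule variables, reduction proceeds innermost first, and the + of the defining type Int is evaluated by a built-in step rather than by rewrite rules.

```python
NULLQ = ("Nullq",)

def enq(q, e):
    return ("Enqueue", q, e)

# (left, right) pairs transcribed from Fig. 11.
RULES = [
    (("Front", NULLQ), "ERROR"),
    (("Front", enq(NULLQ, "?e")), "?e"),
    (("Front", enq(enq("?q", "?e1"), "?e2")), ("Front", enq("?q", "?e1"))),
    (("Dequeue", NULLQ), "ERROR"),
    (("Dequeue", enq(NULLQ, "?e")), NULLQ),
    (("Dequeue", enq(enq("?q", "?e1"), "?e2")),
     enq(("Dequeue", enq("?q", "?e1")), "?e2")),
    (("Append", "?q", NULLQ), "?q"),
    (("Append", "?q1", enq("?q2", "?e2")), enq(("Append", "?q1", "?q2"), "?e2")),
    (("Size", NULLQ), 0),
    (("Size", enq("?q", "?e")), ("+", ("Size", "?q"), 1)),
]

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def match(pattern, term, subst=None):
    """Return a substitution that makes pattern identical to term, or None."""
    subst = dict(subst or {})
    if is_var(pattern):
        if pattern in subst and subst[pattern] != term:
            return None
        subst[pattern] = term
        return subst
    if not isinstance(pattern, tuple) or not isinstance(term, tuple):
        return subst if pattern == term else None
    if len(pattern) != len(term) or pattern[0] != term[0]:
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def substitute(t, subst):
    if is_var(t):
        return subst.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, subst) for a in t[1:])
    return t

def normalize(term):
    """Simplify a constant to its normal form, reducing arguments first."""
    if isinstance(term, tuple):
        term = (term[0],) + tuple(normalize(a) for a in term[1:])
        if term[0] == "+" and all(isinstance(a, int) for a in term[1:]):
            return term[1] + term[2]        # built-in Int addition (assumed)
        for left, right in RULES:
            s = match(left, term)
            if s is not None:
                return normalize(substitute(right, s))
    return term

result = normalize(("Dequeue", ("Append", enq(NULLQ, 1), enq(NULLQ, 0))))
```

Here `result` is the generator constant Enqueue(Nullq, 0). The same function shows that Append(Nullq, Enqueue(Nullq, 0)) and Enqueue(Nullq, 0) have the same normal form, which is the rewriting counterpart of their observable equivalence.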



3.3.2 The Principle of Definition 

The principle of definition is a property of a specification (or a group of 
specifications). The property ensures the consistency of a specification. The property 
reinforces the two-tier characteristic inherent in our specifications: It ensures that the 
generators are specified among themselves, and the nongenerators are specified as total 
functions in terms of the generators. Finally, the property is useful in mechanically proving 
properties of data types from their specifications. The property is similar to a property with 
the same name defined in [22]. Our definition is more general than the one in [22]. 

Definition The Principle of Definition 

A specification (or a group of specifications) S has the principle of definition property if every 
constant t has exactly one normal form (in S), and the normal form is a generator constant of 
the appropriate type. 

There will be situations in our development when it is necessary to use a restricted 
version of the principle of definition. The notion is restricted in the sense that the principle 
of definition need hold good only for a subset of terms. The restricted property is useful in 
stating that every nongenerator defined by a system be defined as a total function on a subset 
of the value set of a type. We give a definition of the property below. 

Definition Principle of Definition With Respect to T 

Let T be a set of generator constants not necessarily including all possible constants. A 
system S satisfies the principle of definition with respect to T if the following condition holds: 
Every constant of the form F(g_1, ..., g_n), where F is a nongenerator function symbol and 
g_1, ..., g_n are generator constants in T, has a unique normal form (in S) that is a generator 
constant in T. 

The principle of definition has two parts to it: It requires every constant to have a 
unique normal form in S, and the normal form to be a generator constant. SCPD has to be 
formulated so as to ensure the two parts. The first part can be ensured by requiring S to be 
convergent (i.e., to satisfy the uniform termination property and the unique termination 
property). The second part is ensured by requiring S to be well-spanned. We define what it 
means for S to be well-spanned below, and then show how the two properties ensure the 
principle of definition of S. 

Consider the rewriting system shown in Fig. 11. The system has three rules (1, 2, 
and 3) in which the expression on the left hand side has Front as its outermost symbol. The 
set, {Nullq, Enqueue(Nullq, e), Enqueue(Enqueue(q, e1), e2)}, of generator expressions that 
appear as arguments to Front on the left hand side in the rules spans the entire set of 
generator constants of Queue_Int; in other words, every generator constant of type 
Queue_Int is an instance of one of the expressions in the above set. When a rewriting system 
has enough rules corresponding to a nongenerator function f so that the set of generator 
expressions appearing as arguments to f spans the set of all generator constants, we say that f 
is well-spanned by the rewriting system. We say that a rewriting system is well-spanned if 
every nongenerator function symbol of the system is well-spanned. We formalize this notion 
below. 

In general, since f can be multi-ary, the arguments to f are k-tuples of expressions of 
appropriate types, where k is the arity of f. In the following formalization, we first define the 
notion of a set of k-tuples of generator expressions being well-spanned. Informally, a set of 
k-tuples of generator expressions is well-spanned if it spans the set of all k-tuples of generator 
constants of appropriate types. The property of a function being well-spanned is defined in 
terms of the notion of a well-spanned set of k-tuples of generator expressions. In the 
following, we assume that the k-tuples are homogeneous with regard to the types of their 
components. The extension to the heterogeneous case is simple. 

Definition A set A = {A_1, ..., A_p} of k-tuples of generator expressions A_i = <e_i1, ..., e_ik> is 
well-spanned if the following condition holds: For every k-tuple, <t_1, ..., t_k>, of generator 
constants there exist n, 1 ≤ n ≤ p, and a substitution σ, such that for every j, 1 ≤ j ≤ k, we 
have t_j = σ(e_nj). 

Definition A nongenerator function f is well-spanned by a rewriting system R if there is in R a 
set of rewrite rules whose left hand sides are of the form f(e_i1, ..., e_ik), 1 ≤ i ≤ p, and the set 
{ <e_i1, ..., e_ik> | 1 ≤ i ≤ p } is well-spanned (complete). 

Definition A rewriting system R is well-spanned if every nongenerator function symbol in R is 
well-spanned. 
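
For a unary nongenerator such as Front, the spanning condition can be tested on all generator constants up to a chosen size. The Python sketch below is a bounded check only, so it can refute well-spannedness but cannot establish it; it is not the effective test referred to in Section 3.3.3. The tuple encoding of expressions, the "?" prefix for variables, and the restriction of integer arguments to a small sample are assumptions of the sketch.

```python
NULLQ = ("Nullq",)

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def match(pattern, term, subst=None):
    """Return a substitution that makes pattern identical to term, or None."""
    subst = dict(subst or {})
    if is_var(pattern):
        if pattern in subst and subst[pattern] != term:
            return None
        subst[pattern] = term
        return subst
    if not isinstance(pattern, tuple) or not isinstance(term, tuple):
        return subst if pattern == term else None
    if len(pattern) != len(term) or pattern[0] != term[0]:
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def generator_constants(depth, ints=(0, 1)):
    """Queue_Int generator constants with at most `depth` uses of Enqueue."""
    terms, layer = [NULLQ], [NULLQ]
    for _ in range(depth):
        layer = [("Enqueue", q, e) for q in layer for e in ints]
        terms.extend(layer)
    return terms

def spans_up_to(patterns, depth):
    """True iff every generated constant is an instance of some pattern."""
    return all(
        any(match(p, t) is not None for p in patterns)
        for t in generator_constants(depth)
    )

# Arguments of Front on the left hand sides of rules (1), (2) and (3).
front_args = [
    NULLQ,
    ("Enqueue", NULLQ, "?e"),
    ("Enqueue", ("Enqueue", "?q", "?e1"), "?e2"),
]
complete_so_far = spans_up_to(front_args, 3)

# Dropping the third pattern leaves queues of two or more elements uncovered.
incomplete = spans_up_to(front_args[:2], 3)
```

With these inputs `complete_so_far` is True and `incomplete` is False.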

Definition A specification S satisfies the sufficient condition for the principle of definition 
(SCPD) if S satisfies the following conditions: 

(i) S is convergent. 

(ii) S is well-spanned. 

Lemma If S satisfies SCPD then S satisfies the principle of definition. 

Proof Condition (i) guarantees that every constant has exactly one normal form. Condition 
(ii) implies that every constant of the form f(g_1, ..., g_n), where f is a nongenerator and 
g_1, ..., g_n are generator constants, is reducible. Any constant that contains a nongenerator 
has an innermost nongenerator application of this form and is therefore reducible. Since S 
satisfies the uniform termination property, this means that no constant with a nongenerator 
can be a normal form. Hence the normal form of every constant is a generator constant. 

Q.E.D. 

3.3.3 Checking the Principle of Definition 

The main reason for formulating SCPD is so that we might be able to develop 
effective methods of checking if a specification satisfies the principle of definition. This 
section sheds some light on this topic. 

To check if a specification is well-spanned, we have to check if the set of expressions 
(or k-tuples of expressions) that appear as arguments to each of the implementing functions is 
complete. Huet in [22] has demonstrated that it is possible to come up with an effective set of 
conditions that is sufficient to check if a set of expressions is complete. 

Checking the convergence of a set of rules, which forms the remaining condition of 
SCPD, has been investigated in [28, 22]. The result in the cited works, which is due to Knuth 
and Bendix, provides an algorithm (henceforth referred to as the KB-algorithm) to check the 
convergence of a finite set of rewrite rules that satisfies the uniform termination property. 
Thus, if we can independently ensure the uniform termination property of a specification, 
then we can use the KB-algorithm to show the unique termination property of the 
specification. 

3.3.3.1 Checking Unique Termination 

Let R be a finite set of rewrite rules that has the uniform termination property. The 
following theorem is the basis for the KB-algorithm for checking the unique termination 
property. The theorem depends upon the concept of unification of expressions. We will first 
define this concept. 

Two expressions α and β with disjoint variable sets are said to be unifiable if there 
exists a substitution θ such that θ(α) = θ(β) (see footnote 13). The most general unifier of two 
unifiable expressions α and β is the unifier θ such that for any unifier σ of α and β there exists a 
substitution ρ such that σ is the composition of ρ and θ. The unification algorithm of 
Robinson [44] can be used to determine a most general unifier of two given expressions or 
decide that they are not unifiable. In the discussion to follow we assume that the candidates 
for unification have variables renamed if necessary to obtain disjoint variable sets. 

13. The symbol = stands for two expressions being identically equal. 

Let $\gamma_1 \to \delta_1$ and $\gamma_2 \to \delta_2$ be two rules of R so that $\gamma_1$ is unifiable with a nonvariable
subexpression of $\gamma_2$. More precisely, there exists an occurrence u in $\gamma_2$ such that $\sigma = \gamma_2/u$ is
not a variable, and $\sigma$ is unifiable with $\gamma_1$. Let $\theta$ be the most general unifier of $\sigma$ and $\gamma_1$.
Then, we say that $\theta(\gamma_2)$ is a superposition of $\gamma_1$ on $\gamma_2$. (If $\beta$ is either a superposition of $\gamma_1$ on $\gamma_2$
or a superposition of $\gamma_2$ on $\gamma_1$, then we say that $\beta$ is a superposition between $\gamma_1$ and $\gamma_2$.) To
each superposition there corresponds a critical pair $\langle \alpha_1, \alpha_2 \rangle$ of expressions defined as follows.
$\alpha_1$ and $\alpha_2$ are the expressions obtained by applying to $\theta(\gamma_2)$ the above two rules, respectively.
More precisely,

$$\alpha_1 = \theta(\gamma_2)[u \leftarrow \theta(\delta_1)]$$

$$\alpha_2 = \theta(\delta_2)$$

For example, consider the following rules:

Append(q1, Enqueue(q2, i2)) $\to$ Enqueue(Append(q1, q2), i2)
Append(Append(q3, q4), q5) $\to$ Append(q3, Append(q4, q5))

$\gamma_1$ is unifiable with the entire expression $\gamma_2$ by the most general unifier [Append(q3, q4)
for q1, Enqueue(q2, i2) for q5], yielding the superposition $\sigma$ and the critical pair $\langle \sigma_1, \sigma_2 \rangle$
shown below:

$\sigma$ = Append(Append(q3, q4), Enqueue(q2, i2))
$\sigma_1$ = Enqueue(Append(Append(q3, q4), q2), i2)
$\sigma_2$ = Append(q3, Append(q4, Enqueue(q2, i2)))
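The construction of a critical pair from a superposition can be traced mechanically. The sketch below takes the unifier as given and only applies the two defining equations; the term encoding (strings for variables, tuples for applications) and the helper names `substitute`, `replace_at`, and `critical_pair` are assumptions introduced for illustration, not part of the thesis.

```python
def substitute(term, theta):
    """Apply the substitution theta (dict: variable -> term) once to a term."""
    if isinstance(term, str):
        return theta.get(term, term)
    return (term[0],) + tuple(substitute(a, theta) for a in term[1:])

def replace_at(term, occurrence, new):
    """Replace the subterm at `occurrence`, a tuple of 1-based argument positions."""
    if not occurrence:
        return new
    i = occurrence[0]
    return term[:i] + (replace_at(term[i], occurrence[1:], new),) + term[i + 1:]

def critical_pair(rule1, rule2, occurrence, theta):
    """Return (superposition, alpha1, alpha2) for rule1 superposed on rule2 at `occurrence`."""
    (gamma1, delta1), (gamma2, delta2) = rule1, rule2
    superposition = substitute(gamma2, theta)
    alpha1 = replace_at(superposition, occurrence, substitute(delta1, theta))
    alpha2 = substitute(delta2, theta)
    return superposition, alpha1, alpha2

# The two rules of the example; the occurrence () denotes the whole of gamma2.
rule1 = (("Append", "q1", ("Enqueue", "q2", "i2")),
         ("Enqueue", ("Append", "q1", "q2"), "i2"))
rule2 = (("Append", ("Append", "q3", "q4"), "q5"),
         ("Append", "q3", ("Append", "q4", "q5")))
theta = {"q1": ("Append", "q3", "q4"), "q5": ("Enqueue", "q2", "i2")}

sigma, sigma1, sigma2 = critical_pair(rule1, rule2, (), theta)
print(sigma, sigma1, sigma2, sep="\n")
```

Because the occurrence is the root of the second left hand side, the first component of the pair is simply the instantiated right hand side of the first rule.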

Theorem 1 The KB-Theorem 

If R has the finite termination property, then it has the unique termination property if and
only if every critical pair $\langle \alpha_1, \alpha_2 \rangle$ of R has the property that $\alpha_1$ and $\alpha_2$ have identical normal
forms.

Proof For a proof see [28, 22]. 

If a finite rewriting system has no superpositions, and therefore no critical pairs, it is said to
be superposition-free. Thus, we trivially have:

Corollary If a finite rewriting system has the uniform termination property, and is 
superposition-free, then it has the unique termination property. 

For example, the rewriting system in Fig. 11 corresponding to Queue_Int is
superposition-free. In the next subsection we show that it satisfies the uniform termination
property. So the rewriting system is convergent.

3.3.3.2 Checking Finite Termination 

A general technique for checking termination of a rewriting system R is to
demonstrate that it is possible to define a well-founded partial ordering $\succ$ on the set of all
constants (that can be constructed using the function symbols in R) so that $t_1 \to t_2$ implies
$t_1 \succ t_2$. A partial ordering is well-founded if there are no infinite descending sequences such
as $t_1 \succ t_2 \succ \ldots$ for any constants. Hence, there cannot be any infinite sequence of rewrites
using R either. Appendix II goes into this topic in greater detail. It describes a theorem that
provides a useful guideline to define a suitable partial ordering to check the uniform
termination property of a rewriting system.

We assume that a well-founded partial ordering $\succ$ on expressions is available as an
input to the synthesis procedure. The ordering $\succ$ is used by the synthesis procedure not only
to ensure the uniform termination property of inputs, but also to ensure that the synthesized
output terminates. The orderings used in our examples belong to a class of orderings
called the lexicographic recursive path ordering [26, 10]. A formal definition of the ordering is
given in Appendix II.
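To make the role of the ordering concrete, the following sketch checks that each rule's left hand side is greater than its right hand side under the standard lexicographic path ordering from the term-rewriting literature. This may differ in detail from the ordering defined in Appendix II. The precedence Append > Enqueue > Nullq, the term encoding, and the rule Append(q, Nullq) $\to$ q are assumptions made for the illustration.

```python
def variables(t):
    """Set of variables (strings) occurring in a term."""
    if isinstance(t, str):
        return {t}
    return set().union(*[variables(a) for a in t[1:]])

def lpo_greater(s, t, prec):
    """True if s > t in the lexicographic path ordering induced by precedence `prec`."""
    if isinstance(s, str):
        return False                      # a variable is never greater than anything
    if isinstance(t, str):
        return t in variables(s)          # s > x when x occurs in the non-variable s
    if any(a == t or lpo_greater(a, t, prec) for a in s[1:]):
        return True                       # some argument of s already dominates t
    if not all(lpo_greater(s, b, prec) for b in t[1:]):
        return False                      # s must dominate every argument of t
    if prec[s[0]] > prec[t[0]]:
        return True                       # larger head symbol
    if s[0] == t[0]:
        for a, b in zip(s[1:], t[1:]):    # equal heads: compare arguments left to right
            if a != b:
                return lpo_greater(a, b, prec)
    return False

prec = {"Append": 2, "Enqueue": 1, "Nullq": 0}   # assumed precedence
rules = [
    (("Append", "q", ("Nullq",)), "q"),                              # assumed rule
    (("Append", "q1", ("Enqueue", "q2", "i")),
     ("Enqueue", ("Append", "q1", "q2"), "i")),
    (("Append", ("Append", "q3", "q4"), "q5"),
     ("Append", "q3", ("Append", "q4", "q5"))),
]
all_decreasing = all(lpo_greater(lhs, rhs, prec) for lhs, rhs in rules)
print(all_decreasing)
```

If every rule is decreasing in such an ordering, no infinite rewrite sequence is possible, which is the uniform termination property discussed above.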

3.4 Proving Properties of a Data Type 

The properties of a data type we are interested in are always expressed as equations
of the form $e_1 \equiv e_2$, where $e_1$ and $e_2$ are expressions, and $\equiv$ denotes the observable
equivalence relation (see sec. 3.1.2). For instance, the property

Append(Append(q1, q2), q3) $\equiv$ Append(q1, Append(q2, q3))

asserts that for every instantiation of
the variables by values the expressions on the two sides of the equation yield observably
equivalent values. Our objective is to prove a property as a theorem from a specification of
the type. This is crucial to our work because synthesis of implementations involves searching
for appropriate theorems of the input specifications. In the following, we describe how to
mechanically prove theorems from a specification that satisfies the principle of definition.

Definition A Theorem of a Specification 

Let S be a specification (or a group of specifications). Let $\sigma$ be a substitution that maps
variables to generator constants. An equation $e_1 \equiv e_2$ is a theorem of S if for every $\sigma$ the
constants $\sigma(e_1)$ and $\sigma(e_2)$ have identical normal forms.

Note that the above definition of a theorem guarantees that if $e_1 \equiv e_2$ is a theorem of S then $e_1$
and $e_2$ always yield observably equivalent values. This is because the principle of definition
ensures that for every instantiation of the variables (in $e_1$ and $e_2$) by generator constants the
two expressions simplify to the same generator constant. This provides a basis for developing
a method for mechanically proving properties of data types from specifications.
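The definition quantifies over all substitutions of generator constants, which cannot be enumerated exhaustively, but it can be sampled. The sketch below evaluates Append directly on generator constants of Queue_Int and checks the associativity property on all queues of up to two items. It is a bounded test rather than a proof, and it assumes that Rule (10) of Fig. 11 is Append(q, Nullq) $\to$ q; the Python names are hypothetical.

```python
from itertools import product

NULLQ = ("Nullq",)

def enqueue(q, i):
    """Generator Enqueue applied to a generator constant q and an item i."""
    return ("Enqueue", q, i)

def append(q1, q2):
    """Normal form of Append(q1, q2) for generator constants q1 and q2."""
    if q2 == NULLQ:
        return q1                      # assumed Rule (10): Append(q, Nullq) -> q
    _, rest, item = q2
    # Rule (11): Append(q1, Enqueue(q2, i)) -> Enqueue(Append(q1, q2), i)
    return enqueue(append(q1, rest), item)

def queues(max_len, items=(0, 1)):
    """All generator constants with at most max_len items drawn from `items`."""
    level = [NULLQ]
    result = list(level)
    for _ in range(max_len):
        level = [enqueue(q, i) for q in level for i in items]
        result.extend(level)
    return result

def associativity_holds(max_len=2):
    """Check Append(Append(a, b), c) against Append(a, Append(b, c)) on samples."""
    qs = queues(max_len)
    return all(append(append(a, b), c) == append(a, append(b, c))
               for a, b, c in product(qs, repeat=3))

print(associativity_holds())
```

A failed check would refute the property; a passed check only lends support, since the definition requires agreement for every generator constant.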

Note that the reverse of the above implication is not true. This is because we 
require that the input specifications be only consistent (via the principle of definition), but 
not complete [25]. A specification S of a data type D is complete if every equation $e_1 \equiv e_2$
such that $e_1$ and $e_2$ are observably equivalent for D is a theorem of S. The synthesis
procedure would be more productive if the input specifications were complete. This is because
it is possible to prove more properties from a complete specification, and hence the synthesis 
procedure might be able to derive a larger class of implementations. 

There are several ways in which the above result can be used to deduce that an 
equation is a theorem of a specification. The methods differ in the reasoning or logic used for 
the deduction. In our development we deal with two kinds of logic: the equational logic, and 
the inductive logic. 

Equational Logic 

In the equational logic $e_1 \equiv e_2$ is deduced to be a theorem of S by checking if $e_1$ and
$e_2$ have the same normal form in S. Note that if $e_1\downarrow = e_2\downarrow$, then it is obvious that $e_1$ and $e_2$
have identical normal forms for every substitution of the variables by generator constants. ($e\downarrow$
denotes the normal form of e.) An equation deduced to be a theorem of S in this fashion is
said to be a theorem in the equational theory of S. When S satisfies the principle of
definition every expression is guaranteed to have a unique normal form. Therefore, it is
possible to develop a general procedure to decide the entire equational theory of S. As an
illustration, we give a proof of

Append(Append(q1, q2), Nullq) $\equiv$ Append(q1, Append(q2, Nullq))

using the specification of Queue_Int shown in Fig. 11.

Equation to be proved: Append(Append(q1, q2), Nullq) $\equiv$ Append(q1, Append(q2, Nullq))

Normal form of left hand side:        Normal form of right hand side:

Append(Append(q1, q2), Nullq)         Append(q1, Append(q2, Nullq))
        | Rule (10)                           | Rule (10)
        v                                     v
Append(q1, q2)                        Append(q1, q2)
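The check performed in the equational logic, reducing both sides to normal form and comparing, can be sketched as a small rewriting engine. The rule set below is an assumption: Rule (11) is taken from the Append example in sec. 3.3.3.1, and Rule (10) is assumed to be Append(q, Nullq) $\to$ q, which is consistent with the proof above but is not quoted from Fig. 11. The term encoding and function names are hypothetical.

```python
def match(pattern, term, binding):
    """One-way matching: extend `binding` so that pattern instantiates to term, else None."""
    if isinstance(pattern, str):                      # pattern variable
        if pattern in binding:
            return binding if binding[pattern] == term else None
        return {**binding, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        binding = match(p, t, binding)
        if binding is None:
            return None
    return binding

def instantiate(term, binding):
    """Replace the pattern variables of `term` by their bindings."""
    if isinstance(term, str):
        return binding.get(term, term)
    return (term[0],) + tuple(instantiate(a, binding) for a in term[1:])

def normalize(term, rules):
    """Innermost rewriting to normal form; assumes the rules terminate."""
    if isinstance(term, str):
        return term
    term = (term[0],) + tuple(normalize(a, rules) for a in term[1:])
    for lhs, rhs in rules:
        binding = match(lhs, term, {})
        if binding is not None:
            return normalize(instantiate(rhs, binding), rules)
    return term

RULES = [
    (("Append", "q", ("Nullq",)), "q"),                                # assumed Rule (10)
    (("Append", "q1", ("Enqueue", "q2", "i")),
     ("Enqueue", ("Append", "q1", "q2"), "i")),                        # Rule (11)
]

left = ("Append", ("Append", "q1", "q2"), ("Nullq",))
right = ("Append", "q1", ("Append", "q2", ("Nullq",)))
print(normalize(left, RULES) == normalize(right, RULES))
```

The same procedure leaves Append(Append(q1, q2), q3) and Append(q1, Append(q2, q3)) as two distinct normal forms, so general associativity is not in the equational theory and requires the inductive logic.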

Inductive Logic 

A property $\Phi$ is deduced to be a theorem in the inductive logic by using, besides the
reduces relation $\to^*$, some form of mathematical induction. A property that is deduced
using the inductive logic is called a theorem in the inductive logic. The set of all properties
that can be deduced from a specification using the inductive logic is called the inductive
theory of the specification.

The induction used is carried over the set of all generator constants using one or
more of the variables in $\Phi$ as parameters for the induction. The induction is based on any
well-founded partial ordering on the set of generator constants. Suppose G is the set of all
generator constants, and $\succ$ is a well-founded partial ordering on G. Suppose we are using
the variable v of $\Phi(v)$ as the parameter of induction. Then the induction rule may be stated as
follows:

Induction rule

If for every $t \in G$ we can show that, for every $t' \in G$ such that $t \succ t'$, $\Phi[v/t'] \Rightarrow \Phi[v/t]$, then
$\Phi(v)$ is a theorem.

To apply the induction rule, we have to define a partial ordering $\succ$ on G. Since G
can, in general, be infinite, the definition of $\succ$ is usually recursive. The step of showing
$\Phi[v/t'] \Rightarrow \Phi[v/t]$, for every $t \succ t'$, is fragmented into several cases. Each of these cases is
established using the relation $\to^*$ as was done in the equational logic. Fig. 12 gives an
example of an inductive proof. It proves the property

Append(Append(q1, q2), q3) $\equiv$ Append(q1, Append(q2, q3))

from the specification of Queue_Int
given in Fig. 11. The proof uses an ordering generated by the following relation on the
generator expressions of Queue_Int: Enqueue(q, i) $\succ$ Nullq, and Enqueue(q, i) $\succ$ q. The
proof uses the variable q3 as the parameter of induction.

Fig. 12. Proof by Inductive Logic

Theorem to be proved: Append(Append(q1, q2), q3) $\equiv$ Append(q1, Append(q2, q3))

Basis: q3 $\mapsto$ Nullq

To prove: Append(Append(q1, q2), Nullq) $\equiv$ Append(q1, Append(q2, Nullq))

Proof is demonstrated above.

Induction: q3 $\mapsto$ Enqueue(q, i)

Hypothesis: Append(Append(q1, q2), q) $\to$ Append(q1, Append(q2, q))

To prove: Append(Append(q1, q2), Enqueue(q, i)) $\equiv$ Append(q1, Append(q2, Enqueue(q, i)))

Normal form of left hand side:                Normal form of right hand side:

Append(Append(q1, q2), Enqueue(q, i))         Append(q1, Append(q2, Enqueue(q, i)))
        | Rule (11)                                   | Rule (11)
        v                                             v
Enqueue(Append(Append(q1, q2), q), i)         Append(q1, Enqueue(Append(q2, q), i))
        | Hyp.                                        | Rule (11)
        v                                             v
Enqueue(Append(q1, Append(q2, q)), i)         Enqueue(Append(q1, Append(q2, q)), i)

It is not possible to develop a general procedure to decide the entire inductive
theory of S. This is because the inductive hypotheses necessary for the proof cannot be
generated automatically in all situations. However, when S satisfies the principle of
definition a significant number of interesting properties in the inductive theory can be proved
automatically. The automatic method, first developed by Musser [38, 22], is based on the
Knuth-Bendix algorithm (see sec. 3.3.3.1) for checking convergence of a rewriting system. We
use this method for synthesizing implementations whose proofs of correctness need
induction. We will explain the method in chapter 4 while describing synthesis in the
inductive theory.

4. Stage 1: The Preliminary Implementation

This chapter discusses the preliminary implementation of a data type, and develops
a method to derive it from the inputs to the synthesis procedure. A distinguishing
characteristic of the method outlined is that it is based on a method for proving the
correctness of a preliminary implementation. The chapter is organized into the following
sections. The first section defines precisely what constitutes a preliminary implementation.
The second section gives a mathematical formulation of the problem involved in the
derivation of a preliminary implementation for a data type from the given inputs. For
convenience, the problem is formulated and solved here for a situation where the
representing domain is identical to the representation value set. In the next chapter, we
extend the derivation problem to the more general situation where the representing domain is
a subset of the representation value set. The last section describes a procedure to derive the
preliminary implementation from the input specifications.

4.1 A Preliminary Implementation 

A preliminary implementation of a data type is an implementation for the 
implemented type in a rewrite rule language. The preliminary implementation uses a 
representation scheme that is consistent with the one characterized by the association 
specification supplied by the user. It consists of two parts: the Representation part, and the
Definitions part.

The Representation part gives the representation type used for the implementation 
of the implemented type. We call the values of the representation type the representation 
values, and the set of representation values the representation value set. Only a subset of the 
representation value set need be used to represent the values of the implemented type. This 
subset is called the representing domain, and is characterized by the association specification. 

The Definitions part contains definitions for a set of new functions on the
representation values. We call the new functions the implementing functions. There is an
implementing function corresponding to every operation of the implemented type; the
former implements the latter. The definition of an implementing function that implements
an operation is called the preliminary implementation of that operation. An implementing
function is not necessarily a total function on the representation value set. However, it has to
be defined on every value of the representing domain. We use the following convention
throughout the development to help associate an implementing function with the operation
of the implemented type it implements: The identifier that denotes an implementing function
is the capitalized version of the identifier that denotes the corresponding abstract operation.
For instance, NULLQ is the implementing function of the operation Nullq.

The Definitions part consists of a set of rewrite rules of the form $e_1 \to e_2$. The
rewrite rules in the Definitions part defining an implementing function F are the ones that
have F as the outermost symbol on their left hand side. $e_1$ and $e_2$ are expressions that may
contain the implementing functions, the operations of the implementing types, and
if_then_else with the following constraints:

(1) The only operations of the representation type that may appear in $e_1$ and $e_2$ are the
generators of the type.

(2) $e_1$ and $e_2$ may not contain any auxiliary (or helping) functions other than
if_then_else.

There are two reasons for constraining the preliminary implementation. Firstly, we 
would like to constrain the structure of the preliminary implementation so that the synthesis 
procedure has to perform less work in searching for the desired solution. Secondly, we want 
to keep the language as simple as possible so that the principle behind the synthesis method is 
brought out more clearly in our description. 

The first constraint is imposed to keep the preliminary implementation derivation 
problem simple. This constraint permits us to ignore several axioms in the specifications of 
the implementing types during verification as well as synthesis of a preliminary 
implementation. In particular, the only axioms in the specification of the representation type 
that we need to consider are the ones that involve only the generators of the type involved in 
the specification. This is because only the generators of the representation type may appear 
in the preliminary implementation. To this extent this constraint simplifies the synthesis 
method. An implementation that also uses the rest of the operations is derived in the next
stage of the synthesis as a transformation of the preliminary implementation.

The second constraint, in general, restricts the logical power, i.e., the ability to
define any computable function on the representation type, of the preliminary
implementation language because the constraint prohibits the use of any helping (or
auxiliary) functions (except if_then_else) in a preliminary implementation. Our synthesis
method cannot automatically discover the helping functions that might be necessary in the
preliminary implementation. We use two approaches to get around this problem; both
approaches amount to relaxing the second constraint. They are explained here briefly, but
are illustrated more clearly when we later consider examples involving them.

The first approach consists of seeking help from the user. We require the user to 
furnish a specification of the helping function needed in the preliminary implementation. 
We then relax the second constraint to permit the use of the helping function in the 
preliminary implementation. 

The second approach consists of introducing a new construct into the preliminary 
implementation language. The construct, which is used primarily in conjunction with a tuple 
type, helps eliminate the need for helping functions while defining several functions on tuple 
types. The motivation for paying special attention to tuple type is because a tuple type is a 
commonly used representation type. The construct provides a way of accessing the 
components of a tuple being returned by an expression of tuple type without explicitly using 
the operations that select the components of a tuple. This construct may be used in 
expressions that appear on the right hand side of an equation of a preliminary 
implementation. The construct is expressed by means of an expression with the following 
syntax: 

$e_2$ where <$v_1, \ldots, v_n$> is $e_{22}$

In the above, $v_1, \ldots, v_n$ are variables; $e_{22}$ is an expression of n-tuple type; $e_2$ is an expression
that may contain the variables $v_1, \ldots, v_n$. The construct binds, in order, $v_1, \ldots, v_n$ to the
components returned by $e_{22}$. The scope of the binding is limited to the expression $e_2$. For
example, consider the expression

<Assign(v1, e, j1), i1, j1 + 1> where <v1, i1, j1> is DEQUEUE(<v, i, j>).

Assuming DEQUEUE is a function from Triple to Triple, the variables v1, i1, and j1 in the above
expression are bound to the components of the triple returned by DEQUEUE(<v, i, j>).
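The binding behavior of the where construct resembles tuple unpacking in an applicative setting. The sketch below mimics it with a higher-order function. The helper `where`, the stand-in definitions of `DEQUEUE` and `assign`, and the use of a Python dict for the array component are all hypothetical; they are not the thesis's definitions and serve only to show the scoping.

```python
def where(body, tuple_expr):
    """Evaluate `body where <v1, ..., vn> is tuple_expr`.

    `body` is a function of n parameters; they are bound, in order, to the
    components of the tuple, and the binding is visible only inside `body`.
    """
    return body(*tuple_expr)

def assign(v, e, j):
    """Hypothetical stand-in for Assign: a copy of array v with element j set to e."""
    updated = dict(v)
    updated[j] = e
    return updated

def DEQUEUE(triple):
    """Hypothetical stand-in from Triple to Triple: advance the front index."""
    v, i, j = triple
    return (v, i + 1, j)

# <Assign(v1, e, j1), i1, j1 + 1> where <v1, i1, j1> is DEQUEUE(<v, i, j>)
v, i, j, e = {0: "a", 1: "b"}, 0, 2, "c"
result = where(lambda v1, i1, j1: (assign(v1, e, j1), i1, j1 + 1),
               DEQUEUE((v, i, j)))
print(result)
```

No selector operations on the triple appear in the body, which is the purpose the construct serves in the preliminary implementation language.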

4.2 The Preliminary Implementation Derivation Problem

Our intention is to study the problem of synthesis within the data type verification 
framework. So we formulate the problem of deriving a preliminary implementation as 
roughly the inverse of the problem of proving the correctness of the preliminary 
implementation. 

First, we develop the criterion of correctness of a preliminary implementation.
Then, we formulate the problem of verifying if a preliminary implementation meets the
correctness criterion. We define the derivation problem after that. For convenience, the
verification problem and the derivation problem are formulated here for a situation in which
the representing domain is identical to the representation value set. This situation
corresponds to the case where the abstraction function is total, and the invariant part of the
association specification is vacuous. We discuss the derivation problem for a situation where
the representing domain is a subset of the representation value set later. It should be noted that
the formulation of the correctness criterion given below applies to both situations.

4.2.1 The Criterion of Correctness 

Informally, for a preliminary implementation to be correct, the implementing 
functions it defines should collectively exhibit a behavior that is consistent with the 
observable behavior characterized by the specification of the implemented type. Also, the 
preliminary implementation should use a representation scheme that meets the requirements 
of the association specification given as input. Let us formalize this intuitive notion.

The formal object that a preliminary implementation is denoting can be considered 
to be a heterogeneous algebra, called the implementation algebra, with the following 
components: 

(i) A principal domain that is a subset of the representation value set. The principal
domain is defined as the set of all values of the representation type that are
"reachable" through the implementing functions corresponding to the constructors
of the implemented type. In other words, the principal domain is the set of
representation values generated by the closure under functional composition of the
implementing functions corresponding to the constructors of the implemented type.

(ii) A domain corresponding to every defining type of the implemented type. We
assume that this domain is identical to the value set of the corresponding defining
type.

(iii) A function corresponding to every implementing function defined by the preliminary
implementation.

A preliminary implementation is correct if the implementation algebra it denotes is 
a model of the implemented data type in a manner constrained by the association 
specification. This means that there exists a homomorphism from the implementation 
algebra to the implemented type that behaves as an identity function on the values of the
defining types, and exactly like the abstraction function characterized by the association 
specification on the values of the representation type. 

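Let $\mathcal{R}$ denote the representing domain, and $\mathcal{A}$ denote the abstraction function
specified by the association specification. Let $\mathcal{H}$ be a function defined as below.

D: Implemented Type, $\mathcal{R}$: Representing Domain, $D_1, \ldots, D_n$: The defining types of D

$$\mathcal{H}: \mathcal{R} \cup D_1 \cup \ldots \cup D_n \to D \cup D_1 \cup \ldots \cup D_n$$

$$\mathcal{A}: \mathcal{R} \to D$$

$$\mathcal{H}(r) = \begin{cases} \mathcal{A}(r) & \text{if } r \in \mathcal{R} \\ r & \text{otherwise} \end{cases}$$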

A preliminary implementation of a data type is correct with respect to the association
specification $\mathcal{A}$, if the following two conditions hold.

(1) Totality Property: Every implementing function is total over $\mathcal{R}$.

(2) Homomorphism Property: The operation f of the implemented type and the
implementing function F are related by the property:
$$(\forall r \in \mathcal{R})[\mathcal{H}(F(\ldots, r, \ldots)) = f(\ldots, \mathcal{H}(r), \ldots)]$$
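The Homomorphism Property can be tested on sample representation values for a concrete, deliberately simple representation. In the sketch below the representing domain is the set of Python lists of integers, which is an assumption for illustration and not the representation scheme used in the thesis. The names `A`, `H`, `ENQUEUE`, `APPEND`, `f_enqueue`, and `f_append` are hypothetical, and the abstract Append assumes Rule (10) is Append(q, Nullq) $\to$ q.

```python
from itertools import product

NULLQ_TERM = ("Nullq",)

# Abstract operations f, evaluated on generator constants of the implemented type.
def f_enqueue(q, i):
    return ("Enqueue", q, i)

def f_append(q1, q2):
    if q2 == NULLQ_TERM:               # assumed Rule (10)
        return q1
    _, rest, item = q2                 # Rule (11)
    return f_enqueue(f_append(q1, rest), item)

# Implementing functions F on the assumed representation (Python lists).
def ENQUEUE(r, i):
    return r + [i]

def APPEND(r1, r2):
    return r1 + r2

def A(r):
    """Abstraction function: a list of items to the queue it represents."""
    term = NULLQ_TERM
    for item in r:
        term = f_enqueue(term, item)
    return term

def H(x):
    """A on representation values, identity on values of the defining types."""
    return A(x) if isinstance(x, list) else x

def homomorphism_holds(max_len=2, items=(0, 1)):
    reps = [list(p) for n in range(max_len + 1) for p in product(items, repeat=n)]
    enqueue_ok = all(H(ENQUEUE(r, i)) == f_enqueue(H(r), H(i))
                     for r in reps for i in items)
    append_ok = all(H(APPEND(a, b)) == f_append(H(a), H(b))
                    for a in reps for b in reps)
    return enqueue_ok and append_ok

print(homomorphism_holds())
```

Such a check samples the universally quantified condition; the verification method developed in this chapter establishes it for every value of the representing domain.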

The correctness criterion formulated above is different from the formulation found
in the literature on data type verification [25, 14, 18], which is not formulated with respect to a
given homomorphism $\mathcal{H}$. According to the conventional formulation a preliminary
implementation is correct if there exists a function $\mathcal{H}$ from the representation value set to the
value set of the implemented type so that: For all r in the principal domain,
$\mathcal{H}(F(\ldots, r, \ldots)) = f(\ldots, \mathcal{H}(r), \ldots)$. Thus, according to this criterion the implementing functions
are not required to be total with respect to $\mathcal{R}$. Note that the principal domain can be a subset
of $\mathcal{R}$. What distinguishes our formulation is the requirement that F be total over $\mathcal{R}$, and also
satisfy the homomorphism property over $\mathcal{R}$.

Our formulation is more useful in the context of synthesis. It enables us to 
determine a principal domain of the implementation algebra (which, in turn, determines the 
set of representation values on which every implementing function should be defined) 
directly from the association specification. This reduces the interdependence of the synthesis 
of preliminary implementation for the various operations of the type. This is because in other 
formulations the principal domain has to be determined by computing the closure under 
composition of the implementing functions of the constructors. Thus the domain of the 
implementing function of each of the constructors is, in general, dependent on the behavior 
of the implementing function of every other constructor. 

The totality requirement is also more interesting in the context of synthesis. In the 
synthesis process the association specification initiates the derivation of an implementation by 
defining the representation scheme to be used. The association specification is expected to 
express the intention of the user regarding the representation scheme he wants the 
implementation (to be derived) to use. So it is logical to assume that the user wants the entire 
representing domain characterized by the association specification to be used for representing
the values of the implemented type. 

4.2.2 The Derivation Problem

The goal of the derivation problem is to derive a preliminary implementation from 
the given inputs so that the preliminary implementation meets the correctness criterion. The 
inputs consist of the specification of the implemented type, the specification of the 
implementing types, and the homomorphism specification. The homomorphism specification
is a specification of the homomorphism $\mathcal{H}$ that the preliminary implementation ought to
obey. This specification is easily derived from the specification of the abstraction function $\mathcal{A}$
(given as a part of the association specification). The Homomorphism Specification contains
two kinds of rewrite rules obtained as described below. The first set of rules specifies that $\mathcal{H}$
behaves exactly like the abstraction function on the representation values. The second set of
rules specifies that $\mathcal{H}$ behaves as an identity function on the values of all the ancillary types.
More precisely, 

(1) if $\mathcal{A}(e_1) \equiv e_2$ belongs to the abstraction function specification
then $\mathcal{H}(e_1) \equiv e_2$ belongs to the Homomorphism Specification

(2) if $\sigma$ is a generator of an ancillary type
then $\mathcal{H}(\sigma(v_1, \ldots, v_n)) \equiv \sigma(\mathcal{H}(v_1), \ldots, \mathcal{H}(v_n))$ belongs to the Homomorphism Specification

Let us call the combination of all the input specifications the Input World (IW). The
restrictions on the inputs (see sec. 2.3.1 of the previous chapter) ensure that the Input World
satisfies the principle of definition. The strategy behind the method used in deriving the
preliminary implementation is based on the principle of definition property.

Suppose IW is supplemented with a set of rewrite rules, called the $\mathcal{H}$-rules, that
express the homomorphism property a preliminary implementation is expected to satisfy: For
every pair of an operation f of the implemented type, and its implementing function F there
exists an $\mathcal{H}$-rule of the form $\mathcal{H}(F(v_1, \ldots, v_n)) \to f(\mathcal{H}(v_1), \ldots, \mathcal{H}(v_n))$. Let us call the
supplemented system the Perturbed World (PW). Let us suppose that the addition of the
$\mathcal{H}$-rules does not destroy the uniform termination property of IW. The reason we refer to the
supplemented system as the Perturbed World is that the addition of the $\mathcal{H}$-rules destroys
the principle of definition property. PW does not satisfy the principle of definition because
the implementing functions that are newly introduced into the system are as yet undefined:
a constant involving the implementing function symbols does not simplify to a generator
constant.

Recall that the principle of definition is a formal expression of the requirement that
every nongenerator function in a system be completely defined as a total function. If we can
generate a set of rewrite rules that can restore the principle of definition property of PW, then
the new set of rules can be considered as a complete definition for the implementing
functions. Thus, preliminary implementation derivation is a problem of restoring the
principle of definition of a system that violates it.
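The shape of the rules added to form the Perturbed World is regular enough to generate from the signature of the implemented type. The following sketch builds one such rule per operation as a pair of terms. The tuple encoding, the symbol name "H", and the arities listed for the Queue_Int operations are assumptions for illustration; the capitalization convention for implementing functions is the one stated in sec. 4.1.

```python
def h_rule(op, arity):
    """Build the rule  H(F(v1, ..., vn)) -> f(H(v1), ..., H(vn))  for operation `op`.

    Terms are tuples (symbol, arg1, ..., argn); variables are strings.
    """
    variables = tuple(f"v{k}" for k in range(1, arity + 1))
    implementing = op.upper()          # e.g. NULLQ implements Nullq
    lhs = ("H", (implementing,) + variables)
    rhs = (op,) + tuple(("H", v) for v in variables)
    return lhs, rhs

# Assumed signature: operation name -> number of arguments.
signature = {"Nullq": 0, "Enqueue": 2, "Append": 2}
h_rules = [h_rule(op, arity) for op, arity in signature.items()]

for lhs, rhs in h_rules:
    print(lhs, "->", rhs)
```

These rules mention the implementing functions but do not define them, which is why adding them leaves constants such as ENQUEUE(NULLQ, i) without a generator normal form.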

More precisely, the problem involved in synthesizing a preliminary implementation 
consists of deriving from the Perturbed World a set of rewrite rules, PI (the acronym stands 
for preliminary implementation), so that 

(1) PI $\cup$ IW satisfies the principle of definition, as well as

(2) PI $\cup$ PW satisfies the principle of definition.

In the following, we give a formal proof that the above conditions guarantee the correctness 
of the preliminary implementation. 

The Correctness Theorem 

Let PI be a set of rewrite rules derived so that the above two conditions hold. Then, PI 
satisfies the criterion of correctness of a preliminary implementation. 

Proof The first condition asserts that PI $\cup$ IW satisfies the principle of definition. This
implies that every nongenerator function in the system, which includes every implementing 
function, is defined as a total function. Hence, PI satisfies the Totality Property. 

To show that PI satisfies the Homomorphism Property, we have to show that every
equation of the form $\mathcal{H}(F(v_1, \ldots, v_n)) \equiv f(\mathcal{H}(v_1), \ldots, \mathcal{H}(v_n))$ is a theorem of PI $\cup$ IW. The
argument to show that the second condition implies this is based on the following interesting
result about any system that satisfies the principle of definition. The result,¹⁴ which is proved
as Theorem 6 in Appendix III, enunciates a sufficient condition for an equation to be a
theorem of a system that satisfies the principle of definition. Suppose S is a system that
satisfies the principle of definition, and $e_1 \equiv e_2$ is an equation so that $e_1$ and $e_2$ have at least
one nongenerator function symbol in them. Then, $e_1 \equiv e_2$ is a theorem of S if $S \cup \{e_1 \to e_2\}$
satisfies the principle of definition. The result is proved in the Lemma to follow.

Because of the second condition PI $\cup$ PW satisfies the principle of definition. Since
PW is IW $\cup$ $\mathcal{H}$-rules, this implies that (PI $\cup$ IW) $\cup$ $\mathcal{H}$-rules satisfies the principle of
definition. Now, by the first condition PI $\cup$ IW satisfies the principle of definition. By
applying the above result, each of the $\mathcal{H}$-rules (when treated as equations) is a theorem of
PI $\cup$ IW. Note that the result can be applied because the $\mathcal{H}$-rules have nongenerator
function symbols in them.

Q.E.D. 

4.3 Derivation of a Preliminary Implementation 

In the previous section the problem of deriving a preliminary implementation was
formulated as deriving a set of rewrite rules, PI, so as to restore the principle of definition
property to the Perturbed World PW. This section develops a procedure to derive a
preliminary implementation. The procedure makes two assumptions about its input: (1) The
Input World (IW) satisfies SCPD, a sufficient condition for the principle of definition, and
(2) a termination ordering $\succ$ on expressions is available to the procedure to ensure the
uniform termination property of rewriting systems.

The obvious strategy for the procedure is to derive the rules of the preliminary
implementation so that PI $\cup$ IW and PI $\cup$ PW satisfy SCPD. But this limits the class of
implementations that can be derived by the procedure. So, we develop another set of
conditions, called the synthesis conditions, that is weaker than SCPD. PI is generated so that
it satisfies the synthesis conditions. It can be shown that when PI satisfies the synthesis
conditions, PI $\cup$ IW and PI $\cup$ PW satisfy the principle of definition. We first formulate the
synthesis conditions, and then develop a procedure to derive a set of rules that satisfies the
synthesis conditions.

14. [22, 38] contain results similar to the one proved in this lemma. The result here is different
because we have a different set of assumptions. The principle of definition property used in [22] is
more constrained than the one we have. The result in [38] assumes that S satisfies a completeness
property called fully specifiedness which is not assumed here. This is the reason for the requirement
in the lemma that $e_1$ and $e_2$ should have at least one nongenerator function symbol in them.

4.3.1 The Synthesis Conditions 

The synthesis conditions for a set of rewrite rules PI are the following: 

(1) Totality Condition: 

(a) PI is well-spanned (for every implementing function) with every rule in it
being of the form $F(g_1, \ldots, g_n) \to t$,¹⁵ where F is an implementing
function symbol, and $g_1, \ldots, g_n$ are generator expressions.

(b) PI satisfies the uniform termination property.

(2) Uniqueness Condition: PI has the unique termination property.

(3) Homomorphism Condition: For every rule $F(g_1, \ldots, g_n) \to t$ in PI,
$\mathcal{H}(F(g_1, \ldots, g_n)) \equiv \mathcal{H}(t)$ is a theorem of PW.

The following Synthesis Theorem shows that when PI satisfies the synthesis conditions,
PI $\cup$ IW and PI $\cup$ PW satisfy the principle of definition, and hence, by the Correctness
Theorem, PI is correct. An informal motivation for the conditions can be given as follows.
The Totality Condition ensures that every implementing function is defined on all the values
of the representation type, and it terminates on each of them. The Uniqueness Condition
ensures that every implementing function is well-defined, in the sense that it yields a unique
value for every argument value. The Homomorphism Condition ensures that the preliminary
implementation satisfies the Homomorphism Property.

15. Note that the syntactic constraint on a preliminary implementation requires that t may contain
neither the function symbol $\mathcal{H}$, nor any of the operations of the implemented type.
The Synthesis Theorem 

If PI satisfies the synthesis conditions, then PI ∪ IW and PI ∪ PW satisfy the principle of
definition, and hence PI is a correct preliminary implementation.

Proof. It is easy to see that PI ∪ IW satisfies the principle of definition because the Totality
Condition and the Uniqueness Condition imply that the preliminary implementation satisfies
SCPD, and IW satisfies SCPD by our assumption about the inputs.

Let NW denote PI ∪ PW, for convenience. We apply Theorem 8 (Appendix III) to
show that NW satisfies the principle of definition. According to that theorem, a rewriting
system S satisfies the principle of definition if

(a) S is well-spanned,

(b) S has the uniform termination property, and

(c) every critical pair <α₁, α₂> of S is such that α₁ = α₂ is a theorem of S.

We show that NW satisfies all three premises of the above theorem. NW is well-spanned.
This is because IW is well-spanned by our assumption, and PI is well-spanned by Totality
Condition (a). The only nongenerator function symbols of NW are the ones in IW and PI.
By Totality Condition (b) PI has the uniform termination property, so NW has the uniform
termination property also. The following lemma shows that NW satisfies premise (c).

Q.E.D.

Lemma. Every critical pair <e₁, e₂> of NW is such that e₁ = e₂ is a theorem of NW.

Proof. Note that PW is convergent. This is because IW is convergent by assumption, and the
ℋ-rules added to IW do not give rise to any new critical pairs.

NW is constructed from PW by adding PI to the former. Therefore, any new
critical pairs of NW would be generated as a result of a superposition of the rules of PI on the
rules of NW. Because of Totality Condition (a) on the form of the rules in PI, the only rules
on which the rules of PI can have a superposition are the following:

(I) the rules of PI themselves,

(II) the rules of the implementing types, or

(III) the ℋ-rules.

Every critical pair <e₁, e₂> determined by a superposition on the rules in
categories (I) and (II) is such that e₁↓ is identical to e₂↓. This is because, by the Uniqueness
Condition, PI satisfies the unique termination property. Hence, e₁ = e₂ is a theorem of NW.

Every critical pair determined by a superposition on the rules in category (III) is of
the form <ℋ(F(g₁, …, gₙ)), ℋ(t)>, where F(g₁, …, gₙ) → t is a rule in PI. By the
Homomorphism Condition, ℋ(F(g₁, …, gₙ)) = ℋ(t) is a theorem of PW, and hence a
theorem of NW.

Q.E.D.

4.3.2 Derivation of the Rules of PI 

The rewrite rules of PI are derived from the Perturbed World (PW). So the initial
task of the derivation procedure is to construct PW. PW is a rewriting system that includes
the Initial World (IW) and the ℋ-rules. IW is constructed by combining the specification of
the implemented type, the specifications of the implementing types, and the Homomorphism
Specification. Without any loss of generality, we assume that there is no conflict among the
names of the various function symbols in the specifications. PW is then formed by adding a
rule of the form ℋ(F(v₁, …, vₙ)) → f(ℋ(v₁), …, ℋ(vₙ)) for every implementing function F
to be defined. We assume that the termination ordering ≻ being used by the synthesis
procedure is such that ℋ(F(v₁, …, vₙ)) ≻ f(ℋ(v₁), …, ℋ(vₙ)), for every implementing
function. This ensures that PW retains the uniform termination property as desired by the
derivation problem. Note that this is not a restriction because the implementing function
symbols (in the ℋ-rules) are fresh symbols being introduced into IW. Hence, an appropriate
ordering can always be found.
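
The construction of the ℋ-rules is mechanical, and the following minimal sketch illustrates it. The term encoding (nested tuples, variables as plain strings), the symbol name "H" standing for ℋ, and the helper names h_rule and perturbed_world are assumptions made for this illustration; they are not notation from the thesis. The arities are read off the ℋ-rules of the Queue_Int example.

```python
# Terms are nested tuples: (symbol, arg1, ..., argN); variables are plain strings.
# "H" stands for the homomorphism symbol; every name below is illustrative only.

def h_rule(impl_name, abstract_name, arity):
    """Build H(F(v1, ..., vn)) -> f(H(v1), ..., H(vn)) for one implementing function F."""
    variables = [f"v{k}" for k in range(1, arity + 1)]
    lhs = ("H", (impl_name, *variables))
    rhs = (abstract_name, *(("H", v) for v in variables))
    return lhs, rhs

def perturbed_world(initial_world, impl_functions):
    """PW = the rules of IW plus one H-rule per implementing function."""
    return list(initial_world) + [h_rule(*f) for f in impl_functions]

# (implementing function, operation of the implemented type, arity)
IMPL = [("NULLQ", "Nullq", 0), ("ENQUEUE", "Enqueue", 2), ("DEQUEUE", "Dequeue", 1),
        ("APPEND", "Append", 2), ("EMPTY", "Empty", 1)]

# IW is left empty here; only the added H-rules are printed.
for lhs, rhs in perturbed_world([], IMPL):
    print(lhs, "->", rhs)
```

The sketch only adds the rules; choosing a termination ordering that orients them from left to right is a separate concern and is not modelled here.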

Although PW is defined to include the specification of every implementing type
completely, it is not necessary to do so. Since the derivation method does not require the
specifications to be complete, one may include only parts of the specifications of the
implementing types. The advantage of doing so is that the fewer rules there are in PW, the more
efficient it is to derive the preliminary implementation. However, by not including certain
rewrite rules one might be excluding certain implementations.

Let us illustrate the construction of PW on an example. We consider the derivation
of an implementation for Queue_Int with Circ_List as the representation type using the
association specification given in Fig. 9 in the previous chapter. Fig. 13 gives the rules of PW
for the example under consideration. The rules of the types Integer and Bool, which are also
among the implementing types, are omitted from the figure for convenience. The rules of the
representation type Circ_List are omitted because they are not going to be used in the
derivation of the preliminary implementation. This situation arises because a preliminary
implementation is permitted to use only the generators of the representation type. So, the
only rules of the representation type needed in verification, and hence also in the derivation
of a preliminary implementation, are the ones that contain only the generators. Since
Circ_List does not have any rules of this kind, Circ_List does not contribute any rules to IW.
Rules (1) through (13) in the figure are rules of Queue_Int; rules (14) through (17) are the
rules of the Homomorphism Specification; rules (19) through (23) are the ℋ-rules.

Fig. 13. The Perturbed World

(1) Front(Nullq) → ERROR

(2) Front(Enqueue(Nullq, e)) → e

(3) Front(Enqueue(Enqueue(q, e1), e2)) → Front(Enqueue(q, e1))

(4) Dequeue(Nullq) → ERROR

(5) Dequeue(Enqueue(Nullq, e)) → Nullq

(6) Dequeue(Enqueue(Enqueue(q, e1), e2)) → Enqueue(Dequeue(Enqueue(q, e1)), e2)

(10) Append(q, Nullq) → q

(11) Append(q1, Enqueue(q2, e2)) → Enqueue(Append(q1, q2), e2)

(12) Empty(Nullq) → True

(13) Empty(Enqueue(q, e)) → False

(14) ℋ(Create) → Nullq

(15) ℋ(Insert(c, i)) → add_at_head(ℋ(c), ℋ(i))

(16) add_at_head(Nullq, i) → Enqueue(Nullq, i)

(17) add_at_head(Enqueue(q, i), i1) → Enqueue(add_at_head(q, i1), i)

(19) ℋ(NULLQ()) → Nullq

(20) ℋ(ENQUEUE(c, i)) → Enqueue(ℋ(c), ℋ(i))

(21) ℋ(DEQUEUE(c)) → Dequeue(ℋ(c))

(22) ℋ(APPEND(c1, c2)) → Append(ℋ(c1), ℋ(c2))

(23) ℋ(EMPTY(c)) → Empty(ℋ(c))

The next task is to derive the rewrite rules of PI from PW. Strictly speaking, PI 
should be derived so that all the three synthesis conditions are satisfied. But, it is more 
convenient to develop a procedure that derives the rewrite rules so that only the Totality 
Condition and the Homomorphism Condition are met. The effect of ignoring the
Uniqueness Condition is not harmful in the sense that it can be fixed at a later stage by 
post-processing the preliminary implementation. The Uniqueness Condition ensures that 
every implementing function defined by PI returns a unique value on every representation 
value. When the Uniqueness Condition is not satisfied, an implementing function F being
defined by PI may be nondeterministic: That is, F can be such that F(v) = v₁ and F(v) = v₂,
but v₁ ≠ v₂; however, both the values v₁ and v₂ will represent the same value of the
implemented type. The nondeterministic behavior, if any, in the preliminary implementation
will be eliminated by our synthesis procedure in the second stage while deriving a target 
implementation. The semantics of the target implementation language is such that it is 
impossible to define nondeterministic functions. 

The procedure derives the preliminary implementation for one operation at a time 
by deriving a separate set of rewrite rules for every operation. The method used is the same 
for every operation. The procedure first determines the left hand sides of all the rules of the 
preliminary implementation. Then, it determines a suitable right hand side for each of the 
rules from the already determined left hand side. 
4.3.2.1 Determining the Left Hand Side

The Totality Condition is used to determine the left hand side of the rules. The 
Totality Condition has two parts: The first part requires PI to be well-spanned, and the 
second part requires PI to have the uniform termination property. The second part is 
ensured while deriving the right hand side, which will be discussed later. The first part is 
used here. 

The well-spannedness property (described formally in sec 2.3.1 of the previous
chapter) requires the left hand side expressions of the rules defining an implementing
function F to satisfy the following property: The set of generator expressions that appear as
arguments to F on the left hand side should span the set of all generator constants. More
precisely, suppose the preliminary implementation of F consists of the following set of rules:
(In the following the question mark identifiers are used as place holders for expressions to be
determined later.)

F(g₁) → ?t₁
   ...
F(gₙ) → ?tₙ

Then, the set {g₁, …, gₙ} should be well-spanned (see sec 2.3.1), i.e., span the set of all
generator constants of the appropriate implementing type. As a concrete
example, any pair of rules that have the form given below constitute a well-spanned set of
rules for ENQUEUE.

ENQUEUE(Create, j) → ?rhs₂

ENQUEUE(Insert(c, i), j) → ?rhs₃

Note that the left hand side of each of the above rules consists of ENQUEUE
applied to arguments that are generator expressions. The set of arguments, i.e., sequences of
generator expressions, to ENQUEUE on the left hand side of the rules is
ArgsSet = {<Create, j>, <Insert(c, i), j>}. ArgsSet spans the set of all ordered pairs of
generator constants because every pair of generator constants (the first one of type Circ_List,
and the second of type Integer) is an instance of one of the arguments in ArgsSet.

It is easy to build a procedure that automatically generates a well-spanned ArgsSet,
once the generators of the representation type are identified. In fact a slight modification to
the procedure referred to in sec 3.3.3 (which checks if an ArgsSet is complete) can be used to
generate a complete set of argument expressions. Thus, an appropriate set of left hand sides
for the rewrite rules to be derived can be determined automatically.
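
One simple way to obtain such a set is to case-split every argument of a representation type on its generators, one level deep, and to cover every other argument by a variable. The sketch below assumes this strategy; it is not the procedure of sec 3.3.3. The GENERATORS table, the fresh variable names v1, v2, … and the function names are hypothetical.

```python
from itertools import count, product

# Generators of each representation type: (name, argument types). A type that
# is absent from the table (e.g. Integer) is simply covered by a variable.
GENERATORS = {"Circ_List": [("Create", []), ("Insert", ["Circ_List", "Integer"])]}

def spanning_patterns(type_name, fresh):
    """Patterns that together cover every generator constant of type_name."""
    gens = GENERATORS.get(type_name)
    if gens is None:
        return [next(fresh)]
    return [(name, *(next(fresh) for _ in arg_types)) for name, arg_types in gens]

def args_set(arg_types):
    """A well-spanned set of argument sequences for a function with these argument types."""
    fresh = (f"v{n}" for n in count(1))
    per_argument = [spanning_patterns(t, fresh) for t in arg_types]
    return list(product(*per_argument))

def left_hand_sides(function_name, arg_types):
    return [(function_name, *args) for args in args_set(arg_types)]

for lhs in left_hand_sides("ENQUEUE", ["Circ_List", "Integer"]):
    print(lhs)
```

For ENQUEUE this yields the two left hand sides used in the example, up to the names of the variables.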

Fig. 14 gives a possible set of left hand side expressions for a preliminary 
implementation for the example under consideration. Note that the right hand side of each 
of the rules in the figure is denoted by a question mark identifier. So Fig. 14 can be 
considered as a partial preliminary implementation of Queue_Int.

4.3.2.2 Determining the Right Hand Side 

The right hand side of each of the rules is determined using the already determined
left hand side so that the Homomorphism Condition and the second part of the Totality
Condition are met. This is where the Perturbed World (PW) comes into the picture.

Fig. 14. A Partial Preliminary Implementation

(1) NULLQ() → ?rhs₁

(2) ENQUEUE(Create, j) → ?rhs₂

(3) ENQUEUE(Insert(c, i), j) → ?rhs₃

(4) FRONT(Create) → ?rhs₄

(5) FRONT(Insert(c, i)) → ?rhs₅

(6) DEQUEUE(Create) → ?rhs₆

(7) DEQUEUE(Insert(c, i)) → ?rhs₇

(8) APPEND(c, Create) → ?rhs₈

(9) APPEND(c, Insert(d, i)) → ?rhs₉

(10) SIZE(Create) → ?rhs₁₀

(11) SIZE(Insert(c, i)) → ?rhs₁₁

PW is used to derive a set of equations, called the synthesis equations, one equation
for every rule in the preliminary implementation. The right hand side of a rule is determined
from the right hand side of the corresponding synthesis equation. The synthesis equation
corresponding to a rewrite rule F(g₁) → ?t₁ is an equation of the form ℋ(F(g₁)) = ℋ(?t₁) that
satisfies the following conditions:

(1) ℋ(F(g₁)) = ℋ(?t₁) is a theorem of PW.

(2) ℋ(F(g₁)) ≻ ℋ(?t₁), where ≻ is the termination ordering on expressions.

(3) ?t₁ contains only the implementing function symbols and the permitted operations of the
implementing types.

It is easy to see the justification for the above conditions. The first condition
contributes towards ensuring the Homomorphism Condition. The second condition ensures 
the uniform termination property. The third condition is just a syntactic constraint that any 
rule in a preliminary implementation ought to satisfy. The next section describes in detail a 
procedure to derive the synthesis equations. 

4.4 Deriving the Synthesis Equations 

Every synthesis equation of the preliminary implementation is derived with the help 
of two inference rules called the synthesis rules. The synthesis rules are designed for 
generating theorems of PW that have the same left hand sides, but different right hand sides. 
For deriving a synthesis equation, the synthesis rules are invoked repeatedly a finite number 
of times to generate a series of theorems until the desired equation is generated. For instance,
the synthesis equation corresponding to the rule ENQUEUE(Insert(c, i), j) → ?rhs₃ (in the
partially derived preliminary implementation given in Fig. 14) is derived by generating a
series of theorems that have ℋ(ENQUEUE(Insert(c, i), j)) as their left hand side. The
generation continues until a theorem whose right hand side qualifies the theorem to be a 
synthesis equation is encountered. 

We investigate two ways in which the synthesis rules can be used for deriving a 
synthesis equation. The first one derives synthesis equations that are in the equational theory 
of PW. The second one derives equations that are in the inductive theory. The second 
method is more general than the first one. A system that implements the synthesis procedure 
would, therefore, use only the second method. We discuss them separately for pedagogic
reasons. First, we formulate the synthesis rules. The subsequent subsections describe the use
of the synthesis rules in deriving the synthesis equations.

4.4.1 The Synthesis Rules 

The idea used for generating an equation is to reverse the method of demonstrating 
that the equation is a theorem of PW. The central notion used in the generation is 
expansion. Expansion is the opposite of reduction. It is the act of applying a rewrite rule to 
an expression from right to left.

4.4.1.1 Informal Explanation 

The basis for the synthesis rules is the result given in the KB-Theorem (sec 3.3.3.1).
The theorem gives rise to the following principle for generating equations that are theorems
of a convergent system. Suppose e₁ is an expression that we wish to have as the left hand side
of the equation. Then, an expression ?e₂ that may appear on the right hand side of any
equation that has e₁ as its left hand side should be such that e₁↓ = ?e₂↓. One way of
ensuring that ?e₂ simplifies to e₁↓ is to obtain ?e₂ by applying to e₁↓ the rewrite rules of the
system from right to left a finite number of times. We call the mechanism of applying a rule
to an expression from right to left expand (see footnote 16).

We will give a formal definition of expand, and discuss its properties, later. Here, we
will give an approximate description of what expand does so that we may develop a first
version of the synthesis rules, and illustrate them on the example. Like reduce, performing
expand consists of several steps. Suppose we wish to expand
Add_at_head(Enqueue(ℋ(c), ℋ(j)), ℋ(i)) using the rule
ℋ(ENQUEUE(c, i)) → Enqueue(ℋ(c), ℋ(i)). One way of doing this is to look for a
subexpression (inside the expression to be expanded) that has the form of the right hand side
of the rule. Then replace the subexpression by the corresponding instance of the left hand
side of the rule. In the present case, the subexpression that appears as the first argument to
Add_at_head in the given expression matches the right hand side of the rule under the
substitution that maps c to c and i to j. The result of expanding the expression is then
Add_at_head(ℋ(ENQUEUE(c, j)), ℋ(i)). The result of expanding an expression e in the
occurrence u by a rule γ → δ is denoted by expand e in u by γ → δ. We use expand(e) to
denote any expression that is obtained by expanding e in some occurrence u by some rule
γ → δ in the rewriting system under consideration.

16. We will generalize the definition of expand later. At that point one of the synthesis rules needs to
be revised slightly as well. According to the definition given here, expansion is identical to the
transformation technique folding used by Darlington [7] for synthesis of recursive programs.
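
The informal description of expand can be sketched as follows, assuming terms are encoded as nested tuples with variables as plain strings and "H" standing for ℋ. The names match, apply and fold_once are chosen for this illustration only. The sketch uses one-way matching, which corresponds to the approximate description above rather than to the generalized definition given later.

```python
def is_var(t):
    return isinstance(t, str)

def match(pattern, term, s):
    """One-way matching: extend s so that s(pattern) == term, or return None."""
    if is_var(pattern):
        if pattern in s:
            return s if s[pattern] == term else None
        return {**s, pattern: term}
    if is_var(term) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        s = match(p, t, s)
        if s is None:
            return None
    return s

def apply(s, t):
    """Apply substitution s to term t (no repeated substitution)."""
    if is_var(t):
        return s.get(t, t)
    return (t[0], *(apply(s, a) for a in t[1:]))

def fold_once(term, rule):
    """Yield each term obtained by replacing ONE subterm that is an instance of the
    rule's right hand side by the corresponding instance of its left hand side."""
    lhs, rhs = rule
    s = match(rhs, term, {})
    if s is not None:
        yield apply(s, lhs)
    if not is_var(term):
        for k in range(1, len(term)):
            for folded in fold_once(term[k], rule):
                yield term[:k] + (folded,) + term[k + 1:]

H = lambda x: ("H", x)
RULE = (H(("ENQUEUE", "c", "i")), ("Enqueue", H("c"), H("i")))
expr = ("Add_at_head", ("Enqueue", H("c"), H("j")), H("i"))
results = list(fold_once(expr, RULE))
print(results)
```

On the expression of the example there is exactly one place where the rule can be applied from right to left, so a single expanded expression is produced.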

We are now in a position to give the synthesis rules. The first rule specifies how to 
start the generation of a series of theorems; it generates a theorem from a given expression 
without the need for any existing theorem. 

Rule 1:   e is an expression
          ------------------
               e = e↓

The second rule specifies a way of generating a new theorem from an existing one using
expand.

Rule 2:        e₁ = e₂
          ------------------
          e₁ = expand(e₂)

To familiarize the reader with the synthesis rules let us invoke each of the synthesis rules to
generate a couple of theorems that have ℋ(ENQUEUE(Insert(c, i), j)) as their left hand side. We
use the rewrite rules of PW given in Fig. 13 for expansion and reduction. The normal form
of ℋ(ENQUEUE(Insert(c, i), j)) is Enqueue(Add_at_head(ℋ(c), ℋ(i)), ℋ(j)), which is
obtained by using the rewrite rule (20) and then (15) for simplification. By invoking synthesis
rule (1) with e = ℋ(ENQUEUE(Insert(c, i), j)), we generate the following theorem of PW:

ℋ(ENQUEUE(Insert(c, i), j)) = Enqueue(Add_at_head(ℋ(c), ℋ(i)), ℋ(j))

Let us now invoke synthesis rule (2) on the above equation. Using the rewrite rule (17) to
expand the entire expression on the right hand side of the above theorem, and then invoking
the rule once more with rewrite rule (20) on the first argument of the result, we can generate
the following theorem of PW:

ℋ(ENQUEUE(Insert(c, i), j)) = Add_at_head(ℋ(ENQUEUE(c, j)), ℋ(i))
4.4.1.2 Formal Definition of Expand

Expansion is roughly the reverse of the process of reduction. The relation that
characterizes a single step of expansion is called expand. Expanding an expression using a
rule is close to applying the rule to the expression from right to left.

The motivation for introducing the mechanism of expansion is to solve a common 
problem encountered during synthesis: This is to find an expression (a desired expression) 
that simplifies to a given expression (the starting expression). For instance, in the derivation
shown earlier, the starting expression was Enqueue(Add_at_head(ℋ(c), ℋ(i)), ℋ(j)), and the
desired expression was ℋ(Insert(ENQUEUE(c, j), i)).

The definition of expand uses the concept of unification, and the most general
unifier (see Appendix I). Let t be an expression, and γ → δ be a rule. We assume that t and
γ have disjoint variable sets. If there are common variables then they have to be renamed
suitably. Let u be an occurrence in t such that t/u is unifiable with δ; let σ be the most
general unifier. Let t′ be the expression t[u ← σ(γ)]. Then, we say that t expands to t′ by
γ → δ in u; we denote this relation by t ← t′. Notice that expanding t by γ → δ in u is not
equivalent to reducing t by δ → γ in u. Expand checks if t/u is unifiable with δ, whereas
reduce checks if t/u has the form of δ. Therefore, there are situations where an expression is
expandable by γ → δ, but not reducible by δ → γ.

The following question arises immediately: Why was expand not defined exactly as
applying a rule in the reverse direction? The reason is that a rule γ → δ may be such that
varset(γ) ⊃ varset(δ). Applying such a rule from right to left will result in an expression that
contains "new" variables, i.e., variables that did not exist in the original expression. The use
of such a variable-dropping rule during reduction represents a situation where the reduction
step caused a "loss" of information: A new variable introduced in an expansion step might
have had in its place an arbitrary expression during the corresponding reduction step. Our 
goal is to reconstruct, if possible, this lost information at a later stage in the expansion process. 
During expansion, therefore, a variable in an expression has to be treated, in general, as 
though an arbitrary expression might be in its place. Using the predicate unifiable to 
determine if an expression is expandable enables us to do this. 
For instance, consider the expansion of Append(q, Nullq) by the rule
Dequeue(Enqueue(Nullq, e)) → Nullq. The resulting expression is
Append(q, Dequeue(Enqueue(Nullq, e))). The variable e is a new variable introduced because
of expansion. Every instance of the latter expression in which e is replaced by any other
expression reduces to the former expression. It might be possible to determine the expression
that has to take the place of e in future expansion steps.
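
A unification-based expansion step can be sketched as below. The encoding is an assumption made for illustration: terms are nested tuples, variables are plain strings, an occurrence is a tuple of argument positions (position 0 holds the function symbol, so arguments start at 1), and the rule is assumed to share no variables with the expression. Unlike the definition above, the sketch applies the unifier to the whole resulting expression, which matters only when a unified variable also occurs outside the chosen occurrence.

```python
def is_var(t):
    return isinstance(t, str)

def walk(t, s):
    while is_var(t) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    t = walk(t, s)
    if is_var(t):
        return t == v
    return any(occurs(v, a, s) for a in t[1:])

def unify(a, b, s):
    """Return a most general unifier extending s, or None if none exists."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b, s) else {**s, a: b}
    if is_var(b):
        return unify(b, a, s)
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        s = unify(x, y, s)
        if s is None:
            return None
    return s

def resolve(t, s):
    t = walk(t, s)
    if is_var(t):
        return t
    return (t[0], *(resolve(a, s) for a in t[1:]))

def subterm(t, path):
    for k in path:
        t = t[k]
    return t

def replace(t, path, new):
    if not path:
        return new
    k = path[0]
    return t[:k] + (replace(t[k], path[1:], new),) + t[k + 1:]

def expand(t, path, rule):
    """Expand t at occurrence `path` by rule (lhs, rhs); None if not expandable."""
    lhs, rhs = rule
    sigma = unify(subterm(t, path), rhs, {})
    if sigma is None:
        return None
    return resolve(replace(t, path, lhs), sigma)

RULE = (("Dequeue", ("Enqueue", ("Nullq",), "e")), ("Nullq",))
print(expand(("Append", "q", ("Nullq",)), (2,), RULE))  # the example from the text
print(expand(("Append", "x", ("Nullq",)), (1,), RULE))  # a variable unifies with Nullq
```

The second call shows an expression that is expandable although the reversed rule would not reduce it: the variable x merely unifies with Nullq. Encoding x as a nullary symbol ("x",) instead makes it behave like a constant, and the same call then returns None.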

It should be pointed out, however, that not all variables in an expression need be 
given such a special treatment during expansion. The variables that appear in the starting 
expression must appear as they are in the desired expression we are shooting for. Therefore, 
while expanding an expression, it is necessary to distinguish between the variables in the 
expression that were introduced by a rule (presumably during earlier steps of expansion) and 
the ones that were transferred to the expression from the starting expression. We classify the 
variables involved in expansion into the following two kinds: 

(1) The variables appearing in the rewrite rules; we continue to call these variables. 

(2) The variables appearing in the expressions on the left hand sides of the rewrite rules 
in the partially generated preliminary implementation (Fig. 14). We call these 
variables terminals. Henceforth, we denote terminals by identifiers that are in 
italics. 

The definition of an expression remains as before except that it may also contain
terminals in it. The definition of a substitution also remains as before; it is a function from
variables to expressions. Thus, when a substitution is extended to be applicable on an 
expression, the terminals in the expression are not substituted for, as we desired. 

In the wake of the formal definition of expand, and the preceding discussion about 
the introduction of variables into expressions due to expansion, we should reconsider the 
formulation of the synthesis rules. The first synthesis rule remains unchanged because it does 
not use the relation expand. The second synthesis rule was formulated as below: 

Rule 2:        e₁ = e₂
          ------------------
          e₁ = expand(e₂)
This formulation is not general enough because it does not account for all the theorems that
can be derived from e₁ = e₂ in one expansion step. If expand(e₂) has variables in it, then
every instance of it can potentially be the right hand side of a theorem. Hence, we
re-formulate the rule as follows:

Rule 2:   e₁ = e₂,  σ is a substitution
          ------------------------------
               e₁ = σ(expand(e₂))

4.4.2 Derivation in the Equational Theory 

As an illustration, let us derive a synthesis equation that is of the form
ℋ(ENQUEUE(Insert(c, i), j)) = ℋ(?rhs₃) in the partial preliminary implementation shown in
Fig. 14. The equation is derived by generating a series of theorems that have
ℋ(ENQUEUE(Insert(c, i), j)) as their left hand side. The generation is begun by invoking
synthesis rule (1) on the left hand side expression. The rest of the theorems in the series are
generated by invoking synthesis rule (2) using the rewrite rules of PW for expansion. The
rewrite rules for expansion are chosen with the following ultimate goal: Obtain a right hand
side that has the form ℋ(?rhs₃) so that ℋ(ENQUEUE(Insert(c, i), j)) ≻ ℋ(?rhs₃), and ?rhs₃
contains only the permitted operations of the implementing types. In the illustration given
below, the generation of every theorem in the series is considered as a step. At each step, the
expression expanded, and the rewrite rule used for expansion are indicated.

Relevant Rewrite Rules of the Perturbed World

(1) ℋ(ENQUEUE(c, j)) → Enqueue(ℋ(c), ℋ(j))

(2) ℋ(Create) → Nullq

(3) ℋ(Insert(c, i)) → Add_at_head(ℋ(c), i)

(4) Add_at_head(Nullq, i) → Enqueue(Nullq, i)

(5) Add_at_head(Enqueue(q, i), j) → Enqueue(Add_at_head(q, j), i)

Form of the theorem to be generated: ℋ(ENQUEUE(Insert(c, i), j)) = ℋ(?rhs₃)
Normal form of ℋ(ENQUEUE(Insert(c, i), j)): Enqueue(Add_at_head(ℋ(c), i), ℋ(j))
Rules used for the normal form: (1), (3)

Step (1) Invoke Synthesis Rule (1) on ℋ(ENQUEUE(Insert(c, i), j))

ℋ(ENQUEUE(Insert(c, i), j)) = Enqueue(Add_at_head(ℋ(c), i), ℋ(j))

Step (2) Expand Expression: Enqueue(Add_at_head(ℋ(c), i), ℋ(j))
Using Rule: (5)

ℋ(ENQUEUE(Insert(c, i), j)) = Add_at_head(Enqueue(ℋ(c), ℋ(j)), i)

Step (3) Expand Expression: Enqueue(ℋ(c), ℋ(j))
Using Rule: (1)

ℋ(ENQUEUE(Insert(c, i), j)) = Add_at_head(ℋ(ENQUEUE(c, j)), i)

Step (4) Expand Expression: Add_at_head(ℋ(ENQUEUE(c, j)), i)
Using Rule: (3)

ℋ(ENQUEUE(Insert(c, i), j)) = ℋ(Insert(ENQUEUE(c, j), i))

The theorem generated in step (4) qualifies to be a synthesis equation.
Hence the desired rule of the preliminary implementation is:

ENQUEUE(Insert(c, i), j) → Insert(ENQUEUE(c, j), i)
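
The four steps above can be replayed mechanically. The sketch below is an illustration under assumptions, not the thesis's procedure: terms are nested tuples with variables as plain strings, "H" stands for ℋ, the rules are the five relevant rules listed above, the occurrence to expand at each step is supplied by hand, and one-way matching suffices because no step introduces a new variable.

```python
def is_var(t):
    return isinstance(t, str)

def match(pattern, term, s):
    if is_var(pattern):
        if pattern in s:
            return s if s[pattern] == term else None
        return {**s, pattern: term}
    if is_var(term) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        s = match(p, t, s)
        if s is None:
            return None
    return s

def apply(s, t):
    if is_var(t):
        return s.get(t, t)
    return (t[0], *(apply(s, a) for a in t[1:]))

def rewrite_once(t, rules):
    """One reduction step (root first, then arguments left to right), or None."""
    for lhs, rhs in rules:
        s = match(lhs, t, {})
        if s is not None:
            return apply(s, rhs)
    if not is_var(t):
        for k in range(1, len(t)):
            r = rewrite_once(t[k], rules)
            if r is not None:
                return t[:k] + (r,) + t[k + 1:]
    return None

def normal_form(t, rules):
    while True:
        r = rewrite_once(t, rules)
        if r is None:
            return t
        t = r

def fold_at(t, path, rule):
    """Apply `rule` from right to left at the occurrence `path` of t."""
    if path:
        k = path[0]
        return t[:k] + (fold_at(t[k], path[1:], rule),) + t[k + 1:]
    lhs, rhs = rule
    s = match(rhs, t, {})
    assert s is not None, "right hand side of the rule does not match here"
    return apply(s, lhs)

H = lambda x: ("H", x)
R1 = (H(("ENQUEUE", "c", "j")), ("Enqueue", H("c"), H("j")))
R2 = (H(("Create",)), ("Nullq",))
R3 = (H(("Insert", "c", "i")), ("Add_at_head", H("c"), "i"))
R4 = (("Add_at_head", ("Nullq",), "i"), ("Enqueue", ("Nullq",), "i"))
R5 = (("Add_at_head", ("Enqueue", "q", "i"), "j"), ("Enqueue", ("Add_at_head", "q", "j"), "i"))
RULES = [R1, R2, R3, R4, R5]

start = H(("ENQUEUE", ("Insert", "c", "i"), "j"))
t0 = normal_form(start, RULES)   # step (1): synthesis rule (1)
t1 = fold_at(t0, (), R5)         # step (2): expand the whole expression by rule (5)
t2 = fold_at(t1, (1,), R1)       # step (3): expand the first argument by rule (1)
t3 = fold_at(t2, (), R3)         # step (4): expand the whole expression by rule (3)
for t in (t0, t1, t2, t3):
    print(t)
```

Reducing the final right hand side with the same rules returns the normal form computed in step (1), which is what makes every generated equation a theorem of the system.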



4.4.3 Derivation in the Inductive Theory 
4.4.3.1 The General Strategy 

The method used for deriving a synthesis equation in the inductive theory is based 
on the following property that every theorem of PW satisfies: If an equation is a theorem of 
PW, then every instance of it is in the equational theory of PW. An instance of an equation
e₁ = e₂ is an equation obtained by replacing every variable in e₁ and e₂ by generator
constants.

We, therefore, take the following approach. Suppose the synthesis equation we
wish to derive is of the form ℋ(F(e₁₁)) = ℋ(?e₁₂) (see footnote 17). We first derive an instance
of the desired equation: This is done by selecting an instance of the left hand side, say
σ(ℋ(F(e₁₁))), for some substitution σ of the terminals in e₁₁ to generator constants. Then, an
instance of the equation, σ(ℋ(F(e₁₁))) = σ(ℋ(e₁₂)), is derived; the method of derivation for the
equational theory described earlier can be used for this purpose. The instance of the equation
derived should be such that a generalization of it, ℋ(F(e₁₁)) = ℋ(e₁₂), which is obtained by
replacing assorted constants by suitable terminals in the instance, is a theorem of PW.

To check if the generalization is a theorem of PW, we use an automatic procedure 
called Is-an-inductive-theorem-of. This procedure is capable of deciding a significant number 
of theorems in the inductive theory of a system. The procedure will be described in a 
subsequent subsection. Another topic that will be deferred until later is determining a
suitable σ. Any substitution that maps all the terminals in the left hand side of the synthesis
equation to arbitrary generator constants will serve our purpose. However, the derivation
would be more efficient if we instantiated as few terminals as possible. A later subsection will
discuss a more judicious way of choosing σ.

In the rest of this subsection, we formalize the notion of the generalization of an 
equation, and then illustrate the general strategy by deriving a synthesis equation 
corresponding to the rewrite rule APPEND(c, Insert(d, i)) → ?rhs₉ in the partial preliminary
implementation of APPEND given in Fig. 14. 

The Generalization of an Equation 

The generalization of an equation e₁ = e₂ with respect to a substitution σ is the set of
equations of which e₁ = e₂ is an instance using σ. When the substitution with respect to
which the equation is being generalized is obvious from the context, we denote the
generalization by Gen[e₁ = e₂]. Formally, every equation e₁′ = e₂′ ∈ Gen[e₁ = e₂] is such that
σ(e₁′) = e₁ and σ(e₂′) = e₂. Note that if e₁ = e₂ has a finite number of function symbols,
Gen[e₁ = e₂] is always finite. For instance, suppose σ is {d ↦ Create}.
Then, Gen[ℋ(APPEND(c, Insert(Create, i))) = ℋ(APPEND(ENQUEUE(c, i), Create))]
contains the following equations:

ℋ(APPEND(c, Insert(Create, i))) = ℋ(APPEND(ENQUEUE(c, i), Create))
ℋ(APPEND(c, Insert(d, i))) = ℋ(APPEND(ENQUEUE(c, i), d))

17. Recall that the left hand side of the synthesis equation is already known.
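
Since every member of the generalization is obtained by putting a terminal back in place of some occurrences of its image under σ, the set can be enumerated directly. The sketch below assumes terms are nested tuples with terminals as plain strings and "H" standing for ℋ; the function names are hypothetical. It also assumes that the terminals in the domain of σ do not occur in the equation being generalized.

```python
from itertools import product

def generalizations(t, sigma):
    """All terms t2 with sigma(t2) == t, for sigma mapping terminals to terms."""
    out = [d for d, image in sigma.items() if image == t]
    if isinstance(t, str):
        out.append(t)            # a terminal outside sigma's domain stays as it is
    else:
        arg_choices = [generalizations(a, sigma) for a in t[1:]]
        out += [(t[0], *combo) for combo in product(*arg_choices)]
    return out

def gen(equation, sigma):
    """Gen[e1 = e2] with respect to sigma, as a list of (lhs, rhs) pairs."""
    e1, e2 = equation
    return [(l, r) for l in generalizations(e1, sigma) for r in generalizations(e2, sigma)]

H = lambda x: ("H", x)
CREATE = ("Create",)
sigma = {"d": CREATE}
instance = (H(("APPEND", "c", ("Insert", CREATE, "i"))),
            H(("APPEND", ("ENQUEUE", "c", "i"), CREATE)))
for lhs, rhs in gen(instance, sigma):
    print(lhs, "=", rhs)
```

For the example the enumeration yields four equations, one for each choice of which occurrences of Create to replace by d, and the two equations listed above are among them.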

As an illustration let us derive an equation of the form
ℋ(APPEND(c, Insert(d, i))) = ℋ(?rhs₉) which gives rise to one of the rules in the preliminary
implementation of APPEND. The derivation begins with the choice of the left hand side of the
instance of the equation to be derived: This has to be an instance of
ℋ(APPEND(c, Insert(d, i))). Let us suppose σ is {d ↦ Create}.

Relevant Rewrite Rules of the Perturbed World

(10) Append(q, Nullq) → q

(14) ℋ(Create) → Nullq

(20) ℋ(ENQUEUE(c, i)) → Enqueue(ℋ(c), ℋ(i))

(22) ℋ(APPEND(c, d)) → Append(ℋ(c), ℋ(d))

Form of the theorem to be generated: ℋ(APPEND(c, Insert(Create, i))) = ℋ(?e)
Normal form of ℋ(APPEND(c, Insert(Create, i))): Enqueue(ℋ(c), ℋ(i))
Rules used for the normal form:

Step (1) Invoke Synthesis Rule (1) on ℋ(APPEND(c, Insert(Create, i)))

ℋ(APPEND(c, Insert(Create, i))) = Enqueue(ℋ(c), ℋ(i))

Step (2) Expand Expression: Enqueue(ℋ(c), ℋ(i))
Using Rule: (10)

ℋ(APPEND(c, Insert(Create, i))) = Append(Enqueue(ℋ(c), ℋ(i)), Nullq)

Step (3) Expand Expression: Nullq
Using Rule: (14)

ℋ(APPEND(c, Insert(Create, i))) = Append(Enqueue(ℋ(c), ℋ(i)), ℋ(Create))

Step (4) Expand Expression: Enqueue(ℋ(c), ℋ(i))
Using Rule: (20)

ℋ(APPEND(c, Insert(Create, i))) = Append(ℋ(ENQUEUE(c, i)), ℋ(Create))

Step (5) Expand Expression: Append(ℋ(ENQUEUE(c, i)), ℋ(Create))
Using Rule: (22)

ℋ(APPEND(c, Insert(Create, i))) = ℋ(APPEND(ENQUEUE(c, i), Create))

Step (6) Generalize the theorem in step (5) by replacing the constant
Create by the variable d to obtain the following equation:

ℋ(APPEND(c, Insert(d, i))) = ℋ(APPEND(ENQUEUE(c, i), d))

Apply Is-an-inductive-theorem-of on the above equation.
This yields True, confirming that the equation is a theorem.

Hence the desired rule (obtained by dropping ℋ on both sides) is:

APPEND(c, Insert(d, i)) → APPEND(ENQUEUE(c, i), d)
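
As a cheap sanity check on a candidate generalization, one can test it on small ground instances: if an equation is a theorem of PW, both sides of every instance must reduce to the same normal form. The sketch below does exactly that and nothing more; it can refute a bad generalization but cannot prove a good one, so it is not a substitute for Is-an-inductive-theorem-of. The tuple encoding, the two integer constants "0" and "1", the depth bound, and the function names are assumptions made for this illustration; ℋ applied to an integer is left uninterpreted.

```python
from itertools import product

def is_var(t):
    return isinstance(t, str)

def match(pattern, term, s):
    if is_var(pattern):
        if pattern in s:
            return s if s[pattern] == term else None
        return {**s, pattern: term}
    if is_var(term) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        s = match(p, t, s)
        if s is None:
            return None
    return s

def apply(s, t):
    if is_var(t):
        return s.get(t, t)
    return (t[0], *(apply(s, a) for a in t[1:]))

def rewrite_once(t, rules):
    for lhs, rhs in rules:
        s = match(lhs, t, {})
        if s is not None:
            return apply(s, rhs)
    if not is_var(t):
        for k in range(1, len(t)):
            r = rewrite_once(t[k], rules)
            if r is not None:
                return t[:k] + (r,) + t[k + 1:]
    return None

def normal_form(t, rules):
    while True:
        r = rewrite_once(t, rules)
        if r is None:
            return t
        t = r

H = lambda x: ("H", x)
NULLQ = ("Nullq",)
PW = [  # rules (10), (11), (14)-(17), (20), (22) of Fig. 13
    (("Append", "q", NULLQ), "q"),
    (("Append", "q1", ("Enqueue", "q2", "e2")), ("Enqueue", ("Append", "q1", "q2"), "e2")),
    (H(("Create",)), NULLQ),
    (H(("Insert", "c", "i")), ("add_at_head", H("c"), H("i"))),
    (("add_at_head", NULLQ, "i"), ("Enqueue", NULLQ, "i")),
    (("add_at_head", ("Enqueue", "q", "i"), "i1"), ("Enqueue", ("add_at_head", "q", "i1"), "i")),
    (H(("ENQUEUE", "c", "i")), ("Enqueue", H("c"), H("i"))),
    (H(("APPEND", "c1", "c2")), ("Append", H("c1"), H("c2"))),
]
INTS = [("0",), ("1",)]

def lists(depth):
    """Generator constants of Circ_List with at most `depth` nested Inserts."""
    out = [("Create",)]
    for _ in range(depth):
        out = [("Create",)] + [("Insert", l, k) for l in out for k in INTS]
    return out

def counterexample(lhs, rhs, depth=2):
    """Return a substitution for the terminals c, d, i that separates the sides, or None."""
    for c, d, i in product(lists(depth), lists(depth), INTS):
        s = {"c": c, "d": d, "i": i}
        if normal_form(apply(s, lhs), PW) != normal_form(apply(s, rhs), PW):
            return s
    return None

good = (H(("APPEND", "c", ("Insert", "d", "i"))), H(("APPEND", ("ENQUEUE", "c", "i"), "d")))
bad = (H(("APPEND", "c", ("Insert", "d", "i"))), H(("Insert", ("APPEND", "c", "d"), "i")))
print(counterexample(*good))
print(counterexample(*bad))
```

The equation obtained in step (6) survives all the instances tried, whereas the deliberately wrong candidate on the last line is separated by an instance in which c is a nonempty list.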



4.4.3.2 The Predicate Is-an-inductive-theorem-of 

Is-an-inductive-theorem-of is a procedure that is used for checking if an equation
e₁ = e₂ is a theorem of a convergent rewriting system S. The procedure is designed so that if
it yields true on e₁ = e₂, then e₁ = e₂ is a theorem of S; if it yields false, then nothing can be
said about e₁ = e₂. While deriving a synthesis equation in the inductive theory, the
procedure is used to check if a generalization of an equation is a theorem of PW. The
procedure is described here.

The procedure is based on a method that uses the KB-algorithm for checking convergence
(see sec 3.3.3.1) to prove inductive properties of a rewriting system. Suppose S
is a convergent rewriting system. To check if e₁ = e₂ is a theorem of S, perform the following
steps:
(1) Form S₁ = S ∪ {e₁ → e₂ (or e₂ → e₁)}.

(2) Check if S₁ is convergent. The KB-algorithm of checking convergence (which
consists of checking if every critical pair <α₁, α₂> of S₁ is such that α₁↓ = α₂↓) is
used for this.

If the result of step (2) is affirmative, then e₁ = e₂ is a theorem; otherwise nothing can be said
about it, in general. Let us assume that there exists a procedure, called
Can-be-made-convergent, that implements this method.

We will first briefly summarize the method, and then describe how 
Is-an-inductive-theorem-of is built on top of it 

The result that provides a basis for the above method is proved in Theorem 7 in 
Appendix III which gives a few useful results about convergent systems. The result is similar 
to the one that was first developed by Musser [38], and that has also been investigated in [22]. 
Our result is different because the cited works assume that S satisfies a notion of 
completeness (similar to the principle of definition) besides convergence. 

In the present situation PW, whose theorems we are interested in, is convergent but
does not satisfy the principle of definition. Because of this, the above method is applicable
only when e1 (or e2) is such that for every instantiation of the variables by generator constants,
e1 simplifies to a generator constant. The left hand side of every equation we wish to check is
of the form ℋ(F(g1, ..., gn)), where F is an implementing function symbol, and g1, ..., gn
are generator expressions. Note that ℋ(F(g1, ..., gn)) reduces to f(ℋ(g1), ..., ℋ(gn)) by the
ℋ-rule corresponding to F. The latter expression satisfies the desired condition since f and ℋ
are well-spanned by PW (see footnote 18).

There are several situations when the method described above is not applicable for
proving an equation e1 ≡ e2, but there exists another equation e1' ≡ e2' such that

(1) e1' ≡ e2' can be proved using the above method,

(2) e1 ≡ e2 is a theorem if e1' ≡ e2' is a theorem, and

(3) e1' ≡ e2' can be derived automatically from e1 ≡ e2.

18. Note that if a function f is well-spanned by PW, then every term of the form f(t1, ..., tn), where
t1, ..., tn are generator terms, can be simplified to a generator term using PW.

In other words, e1' ≡ e2' is serving as a lemma for the theorem e1 ≡ e2. The
procedure Is-an-inductive-theorem-of consists of transforming e1 ≡ e2 to e1' ≡ e2', and then
applying Can-be-made-convergent on e1' ≡ e2'. The transformation of e1 ≡ e2 to e1' ≡ e2' is
performed by a function ℒ, called the lemma deriving function. The lemma deriving function
used by Is-an-inductive-theorem-of is defined below:

The Lemma Deriving Function (ℒ)

ℒ is a function on expressions. ℒ can be used to derive for a given equation e1 ≡ e2 a lemma
that the proof of the former is dependent on. The two sides of the lemma are obtained by
applying ℒ to e1 and e2.

ℒ: expression → expression

Usage: ℒ(a1)

Pre: a1 is of the form ℋ(a2), where a2 does not contain the symbol ℋ.

Returns: An expression β that is obtained by replacing in a1↓ (the normal form of a1) every
subexpression of the form ℋ(d), where d is any terminal, by a new terminal.
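The replacement step of the lemma deriving function can be sketched in Python. This is only an illustration under stated assumptions: terms are modelled as nested tuples `(symbol, arg1, ..., argn)`, terminals as plain strings, the symbol `"H"` stands for ℋ, and the input is assumed to be already simplified to normal form. The function name `derive_lemma_side` and the `renaming` dictionary are hypothetical and do not come from the thesis.

```python
def derive_lemma_side(term, renaming):
    """Replace every subterm ("H", d), where d is a terminal, by a new terminal.

    `renaming` maps a terminal name to its replacement, so that the same
    terminal receives the same replacement on both sides of an equation.
    Terminals without an entry get a default name (an assumption of this sketch).
    """
    if isinstance(term, str):          # a terminal or constant: unchanged
        return term
    head, *args = term
    if head == "H" and len(args) == 1 and isinstance(args[0], str):
        return renaming.setdefault(args[0], args[0] + "_h")
    return (head, *[derive_lemma_side(a, renaming) for a in args])


# Both sides of the example equation, already in normal form.
lhs = ("Append", ("H", "c"), ("Add_at_head", ("H", "d"), ("H", "i")))
rhs = ("Append", ("Enqueue", ("H", "c"), ("H", "i")), ("H", "d"))

renaming = {"c": "q", "d": "R", "i": "i"}
lemma = (derive_lemma_side(lhs, renaming), derive_lemma_side(rhs, renaming))
```

With the renaming chosen here, `lemma` corresponds to Append(q, Add_at_head(R, i)) ≡ Append(Enqueue(q, i), R).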



We will now illustrate the procedure Is-an-inductive-theorem-of to check if the
equation ℋ(APPEND(c, Insert(d, i))) ≡ ℋ(APPEND(ENQUEUE(c, i), d)) is a theorem of
PW being used in our example. The equation was obtained in step (6) while deriving a
synthesis equation in the previous section.

Equation to be checked: ℋ(APPEND(c, Insert(d, i))) ≡ ℋ(APPEND(ENQUEUE(c, i), d)).

Step (1) Derive Lemma by applying ℒ:

(a) Simplify both sides,

(b) Replace ℋ(c) by q, ℋ(d) by R, ℋ(i) by i

ℋ(APPEND(c, Insert(d, i)))                   ℋ(APPEND(ENQUEUE(c, i), d))
            ↓                                            ↓
Append(ℋ(c), Add_at_head(ℋ(d), ℋ(i)))        Append(Enqueue(ℋ(c), ℋ(i)), ℋ(d))

Lemma to be checked: Append(q, Add_at_head(R, i)) ≡ Append(Enqueue(q, i), R)

Step (2) Check if critical pairs are convergent:

(a) Critical pair determined by Rule (16):

                    Append(q, Add_at_head(Nullq, j))
                     /                           \
Append(Enqueue(q, j), Nullq)             Append(q, Enqueue(Nullq, j))
            ↓                                        ↓
      Enqueue(q, j)                            Enqueue(q, j)

(b) Critical pair determined by Rule (17):

                    Append(q, Add_at_head(Enqueue(r1, j1), j))
                     /                                     \
Append(q, Enqueue(Add_at_head(r1, j), j1))       Append(Enqueue(q, j), Enqueue(r1, j1))
            ↓                                                ↓
Enqueue(Append(Enqueue(q, j), r1), j1)           Enqueue(Append(Enqueue(q, j), r1), j1)
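As a sanity check on the lemma, one can normalize both sides on small ground instances. The sketch below is not the KB-based method itself; it only tests instances. It assumes the following rewrite rules, read off from the critical pair diagrams rather than quoted from PW: Append(q, Nullq) → q, Append(q1, Enqueue(q2, e)) → Enqueue(Append(q1, q2), e), Add_at_head(Nullq, j) → Enqueue(Nullq, j), and Add_at_head(Enqueue(r, j1), j) → Enqueue(Add_at_head(r, j), j1). Terms are nested tuples; the helper names are hypothetical.

```python
from itertools import product


def normalize(term):
    """Innermost rewriting of a term to normal form under the assumed rules."""
    if isinstance(term, str):
        return term
    head, *args = term
    args = [normalize(a) for a in args]
    if head == "Append":
        q1, q2 = args
        if q2 == "Nullq":
            return q1
        if isinstance(q2, tuple) and q2[0] == "Enqueue":
            return ("Enqueue", normalize(("Append", q1, q2[1])), q2[2])
    if head == "Add_at_head":
        r, j = args
        if r == "Nullq":
            return ("Enqueue", "Nullq", j)
        if isinstance(r, tuple) and r[0] == "Enqueue":
            return ("Enqueue", normalize(("Add_at_head", r[1], j)), r[2])
    return (head, *args)


def queue(items):
    """Build the generator term Enqueue(...Enqueue(Nullq, x1)..., xn)."""
    term = "Nullq"
    for x in items:
        term = ("Enqueue", term, x)
    return term


# Check Append(q, Add_at_head(R, i)) against Append(Enqueue(q, i), R)
# on all queues of length 0..2 over two items.
samples = [queue(p) for n in range(3) for p in product("ab", repeat=n)]
lemma_holds_on_samples = all(
    normalize(("Append", q, ("Add_at_head", r, "i")))
    == normalize(("Append", ("Enqueue", q, "i"), r))
    for q in samples
    for r in samples
)
```

Agreement on samples does not prove the lemma; it only guards against a mistyped rule before the convergence check is attempted.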
4.4.3.3 An Instantiation for the Synthesis Equation 

Here, we describe a method of finding a substitution σ that determines the left hand
side of the instance of the theorem we wish to generate. Note that the left hand side of the
theorem is already known to us, which in the current example is ℋ(APPEND(c, Insert(d, i))).
σ maps the terminals in the left hand side expression to suitable expressions. σ should be
chosen so that the equation σ(ℋ(APPEND(c, Insert(d, i)))) ≡ σ(ℋ(?e2)) is in the equational
theory of PW. This implies that σ should be such that σ(ℋ(APPEND(c, Insert(d, i)))) and
σ(ℋ(?e2)) have the same normal form. Note that ℋ(?e2) is unavailable to us at the moment.
So, σ has to be determined from the left hand side expression alone. Since the theorem
ℋ(APPEND(c, Insert(d, i))) ≡ ℋ(?e2) is not necessarily in the equational theory of PW, an
arbitrary substitution that maps terminals to generator terms cannot be used.

The following fact about our proof method (for inductive properties) serves as the
basis for the method of finding σ. The basis step of the inductive proof can always be carried
out using the equational logic. So, we choose the σ that corresponds to a basis step of the
proof of the lemma. The instantiation corresponding to the basis step can be determined
automatically starting from the left hand side of the theorem alone.

Finding such a σ involves two stages because the proof of the theorem, as you may
recall, involves two stages: converting the theorem to the lemma, and then proving the
lemma itself. We first determine a substitution ω that corresponds to a basis step of the proof
of the lemma. σ is determined from ω using the method used by the lemma deriving
function ℒ to convert the theorem to the lemma. We describe the two steps below.

Step (1) Determination of ω

(a) Find the left hand side of the lemma.

This is obtained by applying ℒ, the lemma deriving function, to the left hand side of
the theorem. For our example, the left hand side of the theorem is
ℋ(APPEND(c, Insert(d, i))). To obtain the left hand side of the lemma we simplify
the expression, and replace every subexpression that has ℋ at the root by a new
terminal: ℋ(APPEND(c, Insert(d, i))) →* Append(ℋ(c), Add_at_head(ℋ(d), ℋ(i))).
So the left hand side of the lemma is Append(q, Add_at_head(R, i)).

(b) Find a basis step in the proof of the lemma.

For this, compute all the superpositions between the left hand sides of the rules of
PW and the left hand side of the lemma. Simplify the superpositions. A sufficient
condition for a superposition to correspond to a basis step is that its normal form is a
generator expression. The most general unifier that determines such a superposition
is a candidate ω. The following table gives the result of performing the above steps
on the current example. The columns, in order, give the rewrite rule in PW
responsible for the superposition, the superposition, and the normal form of the
superposition. The first superposition in the list simplifies to a generator expression.
Therefore, ω is the most general unifier corresponding to the first superposition,
which is {R ↦ Nullq}.

Rule    Superposition                                     Normal form of superposition

(16)    Append(q, Add_at_head(Nullq, i))                  Enqueue(q, i)

(17)    Append(q, Add_at_head(Enqueue(r1, j1), i))        Enqueue(Append(q, Add_at_head(r1, i)), j1)

Step (2) Determine σ from ω

ω provides instantiations for the terminals in the left hand side of the lemma; σ instantiates
the terminals in the left hand side of the theorem. Our objective is to find a σ so that when
the left hand sides (of the lemma and the theorem) are instantiated by ω and σ, respectively,
they simplify to the same expression.

For instance, in the current example, the left hand side of the theorem is
e1 = ℋ(APPEND(c, Insert(d, i))), whose normal form is
e2 = Append(ℋ(c), Add_at_head(ℋ(d), ℋ(i))). The left hand side of the lemma is
e3 = Append(q, Add_at_head(R, i)), which was obtained by replacing ℋ(d) by R, and ℋ(c) by
q. ω maps R to Nullq, and leaves the rest of the terminals unchanged. Therefore, σ should
map d to an expression such that Nullq ≡ ℋ(d) is a theorem in the equational theory of PW.
The instantiation for d can therefore be determined using the first two synthesis rules by
generating a theorem that has Nullq on the left hand side, and an expression of the form
ℋ(?e) on the right hand side. The generation sequence is shown below. The first theorem is
obtained by invoking Synthesis Rule (1) for the expression Nullq. The second theorem is
obtained by using Synthesis Rule (2); rewrite rule (14) of PW is used for the expansion. The right
hand side, ℋ(Create), of the theorem generated determines σ as {d ↦ Create}.

Nullq ≡ Nullq

      ≡ ℋ(Create)

4.5 An Abstract Implementation of the Derivation Procedure 

Below, we give an implementation for a procedure Generate-a-rule. The procedure
determines a suitable right hand side expression for a rewrite rule in a partial preliminary
implementation given the left hand side expression. The procedure also expects a Perturbed
World and a termination ordering as inputs. The procedure is implemented in a high level
algorithmic language whose semantics is self-explanatory.

The implementation assumes that there exist two procedures
Is-an-inductive-theorem-of and A-suitable-instantiation-for-lhs. The latter finds a suitable
substitution that determines the instance of the synthesis equation to be generated.

The procedure performs essentially the theorem generation illustrated before in a
systematic fashion. Roughly, it operates as follows. It finds the instance of the left hand side
of the synthesis equation by applying A-suitable-instantiation-for-lhs to ℋ(lhs). It simplifies
this expression to its normal form. The normal form is then expanded repeatedly using
appropriate rewrite rules of PW until a suitable right hand side is encountered.

The nontrivial aspect of the procedure concerns performing expansion in an 
effective fashion. There are two problem areas. Firstly, expansion is not uniformly 
terminating. That is, expansion is a potentially nonterminating activity. The procedure uses 
the termination ordering >- to circumvent this problem. The right hand side has to be an 
expression that is less than the given left hand side. But, expanding an expression always 
gives rise to a bigger expression in the ordering >-. Thus, the procedure can be terminated 
the moment we encounter an expression that is not less than the left hand side. (Note that the 
>- is such that there can only be a finite number of expressions less than any given 
expression.) 

Secondly, expansion is not uniquely terminating. That is, an expression can be
expanded in several different ways using the rules in PW (but only finitely many, because
PW has a finite number of rules). Not all of them necessarily lead to the same final
expression. Some of them may not even lead to a suitable right hand side expression. In the
examples illustrated earlier, the rules of PW were carefully chosen so that they resulted in the 
desired right hand side. A working implementation, however, is forced to keep track of all 
possible expansions since any one of them can result in the desired right hand side. In the 
implementation given below the variable S is used for this purpose. 

This chore, in fact, happens to be the main source of inefficiency in the synthesis
procedure. We use the following obvious ways of getting rid of unproductive expansion
paths. Firstly, type information is used to eliminate some of the candidate rewrite rules for
expansion. Secondly, expansions that result in an expression that is not less than the left hand
side are not going to be fruitful, and are discarded. Finally, we make a distinction between
the variables that appear in the rewrite rules of PW, and the ones in the given left hand side.
The latter, which are terminals, are treated as constants. This eliminates several rewrite rules
that would otherwise be candidates for expansion.

It should be noted that the procedure given below is only a part of a complete
implementation of the synthesis procedure. The other part is expected to determine the left
hand sides of the rules. We have assumed that there exists a procedure to determine the left
hand sides. If the following procedure does not succeed in finding a suitable right hand side
for a given left hand side, then another set of left hand sides has to be generated, and the
following procedure reexecuted.



Generate-a-rule = proc (PW: Perturbed World, lhs: F(g1, ..., gn),
                        ≻: ordering) returns (Rewrite Rule)

    % Initialization
    σ: Substitution ← A-suitable-instantiation-for-lhs(ℋ(lhs))
    ilhs ← σ(lhs)
    S ← {ℋ(ilhs)↓}

    repeat
        % Test if expansion can be stopped
        if There-exists-a-suitable-candidate-in(S)
            then rhs ← Fetch-a-suitable-candidate-from(S)
                 return(lhs → rhs)
        endif

        % If a candidate has not been generated yet, expand by one more step
        S1 ← ∅
        for every t ∈ S do
            S1 ← S1 ∪ set-of-all-expansions-of t by PW
        endfor

        % Drop from S1 unproductive expressions
        for every t ∈ S1 do
            if ¬(lhs ≻ t) then S1 ← S1 − {t} endif
        endfor
        S ← S1
    forever

    % Subprocedure description
    There-exists-a-suitable-candidate-in: subproc (S: Set[Expression]) returns (Boolean)
        if ∃ t ∈ S such that
            ∃ ℋ(F(g1, ..., gn)) ≡ ℋ(?rhs) ∈ Gen[ilhs ≡ t] such that
            (1) ?rhs does not contain ℋ or operations of the implemented type,
            (2) F(g1, ..., gn) ≻ ?rhs, and
            (3) Is-an-inductive-theorem-of-PW(ℋ(F(g1, ..., gn)) ≡ ℋ(?rhs))
        then return(True) else return(False)
    end subproc

    % Subprocedure description
    Fetch-a-suitable-candidate-from: subproc (S: Set[Expression]) returns (Expression)
        if ∃ t ∈ S such that
            ∃ ℋ(F(g1, ..., gn)) ≡ ℋ(?rhs) ∈ Gen[ilhs ≡ t] such that
            (1) ?rhs does not contain ℋ or operations of the implemented type,
            (2) F(g1, ..., gn) ≻ ?rhs, and
            (3) Is-an-inductive-theorem-of-PW(ℋ(F(g1, ..., gn)) ≡ ℋ(?rhs))
        then return(t)
    end subproc

end Generate-a-rule
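The control structure of Generate-a-rule (test the current set, expand every member by one step, prune by the ordering) can be sketched in Python independently of the term machinery. This is a minimal sketch, not the thesis procedure: the three callbacks `expansions_of`, `is_suitable`, and `is_smaller_than_lhs` are hypothetical stand-ins for expansion by PW, the candidate test, and the check lhs ≻ t. One deliberate difference is that the loop returns `None` once no expressions remain, instead of repeating forever.

```python
def generate_rhs(start, expansions_of, is_suitable, is_smaller_than_lhs):
    """Breadth-first expansion from `start`.

    Returns the first suitable expression found, or None when every
    expansion path has been pruned by the ordering.
    """
    frontier = {start}
    while frontier:
        # Test if expansion can be stopped (sorted only for determinism).
        for t in sorted(frontier, key=repr):
            if is_suitable(t):
                return t
        # Expand every expression by one more step.
        expanded = set()
        for t in frontier:
            expanded |= set(expansions_of(t))
        # Drop unproductive expressions: those not below the left hand side.
        frontier = {t for t in expanded if is_smaller_than_lhs(t)}
    return None


# Toy instance: expressions are strings, expansion appends one letter,
# and "smaller than the left hand side" means shorter than 4 characters.
def toy_expand(s):
    return [s + "a", s + "b"]


found = generate_rhs("x", toy_expand, lambda s: s == "xab", lambda s: len(s) < 4)
missing = generate_rhs("x", toy_expand, lambda s: s == "xzz", lambda s: len(s) < 4)
```

In the toy instance the search terminates in both cases because only finitely many strings are shorter than the bound, mirroring the role the termination ordering plays in the procedure above.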



set-of-all-expansions-of_by: Expression × Rule → Set[Expression]

Usage: set-of-all-expansions-of t by γ → δ

Returns: The set of all possible expansions of a given term via a given rule.

set-of-all-expansions-of_by: Expression × Set[Rule] → Set[Expression]
Usage: set-of-all-expansions-of t by ℛ
Returns: The set of all terms s such that

s = ∪ set-of-all-expansions-of t by R, for all R ∈ ℛ



expand_in_by: Expression × Occurrence × Rule → Expression
Usage: expand t1 in u by γ → δ
Pre: Varset(t1) ∩ Varset(γ) = ∅ % For convenience

t1/u is-unifiable-with δ

Returns: expand t1 in u by γ → δ yields a term t2 such that every term that reduces (in u by γ → δ)
to an instance of t1 will be an instance of t2. In other words, t2 is the most general instance of
all the terms that reduce (in u by γ → δ) to an instance of t1. Note that the result the
function returns is unique up to permutations of the variables. This is because σ, which is
the most general unifier of two terms, is always unique when restricted to the variables in
the two terms t1 and δ.



expands-to_in_by: Expression × Expression × Occurrence × Rule → Bool

Usage: t1 expands-to t2 in u by γ → δ
Pre: Varset(γ) ∩ Varset(t1) = ∅

Returns: A predicate that tests if a term expands to another given term:
(t1/u) is-unifiable-with δ ∧ t2 = expand t1 in u by γ → δ
5. Extending the Derivation Problem

The derivation problem and the derivation procedure described in the last chapter
apply to a situation in which the representing domain (ℛ) for the desired preliminary
implementation is unrestricted. That is, ℛ includes all the values of the representation type.
This section extends the problem to the more general situation where ℛ is a subset of the
value set of the representation type.

ℛ contains the set of values that are permitted to be used by a preliminary
implementation for representing the values of the implemented type. It is characterized by
the association specification supplied by the user. Suppose 𝒜 and ℐ are the abstraction
function and the invariant specified by the association specification, respectively. Then ℛ is
the set of all values for which ℐ is true. The present situation is one in which ℐ is true on only
a subset of the representation value set.

For instance, consider the association specification given in Fig. 15. This example
will be used to illustrate the procedure described in the chapter. It specifies an
implementation of Queue_Int in terms of Array_Int × Integer × Integer. The abstraction
function 𝒜 can be described informally as follows. Nullq can be represented by any triple in
which both the integer components are equal. A nonempty queue can be represented by a
triple <v, i, j>. v is an array of arbitrary length containing the elements of the queue, in order,
between the index values i and j-1. In other words, i points to the front end of the queue, and
j points to the next available position in v for adding a new element into the queue. The
invariant ℐ is true on all triples such that i ≤ j and the array is guaranteed to be defined on all
index values between i and j.



Fig. 15. Queue_Int in terms of Triple

𝒜(<v, i, i>) ≡ Nullq

𝒜(<Assign(v, e, j), i, j + 1>) ≡ if i = j + 1 then Nullq
                                 else Enqueue(𝒜(<v, i, j>), e)

ℐ(<v, i, i>) ≡ True

ℐ(<Assign(v, e, j), i, j + 1>) ≡ if i = j + 1 then True
                                 else if i < j + 1 then ℐ(<v, i, j>)
                                 else False
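The informal reading of the abstraction function and the invariant can be mirrored in a few lines of Python. This is a sketch of the informal description only, not of the term-level specification: arrays are modelled as dictionaries from index to element (so an undefined position is a missing key), queues as Python lists, and the function names are hypothetical.

```python
def invariant(v, i, j):
    """True when i <= j and v is defined on every index from i to j-1."""
    return i <= j and all(k in v for k in range(i, j))


def abstraction(v, i, j):
    """The queue represented by <v, i, j>: elements v[i], ..., v[j-1], front first."""
    return [v[k] for k in range(i, j)]


empty_triple = ({}, 3, 3)                   # any triple with equal indices is Nullq
two_elements = ({2: "a", 3: "b"}, 2, 4)     # represents the queue a, b
```

Under this model a triple such as `({2: "a", 3: "b"}, 2, 5)` lies outside the representing domain, because position 4 of the array is undefined.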

5.1 Characterization of the Problem 

The criterion of correctness (stated in the previous chapter in Sec. 4.2.1) that was
used to characterize the problem earlier is applicable in the current situation as well. For
convenience, we repeat the criterion below: A preliminary implementation of a data type is
correct with respect to an association specification (that characterizes an abstraction function
𝒜 and a representing domain ℛ) if the following properties hold.

(1) Totality Property: Every implementing function is total over ℛ.

(2) Homomorphism Property: The implementing function F and the operation f of the
implemented type are related by the following homomorphism property:

(∀ r ∈ ℛ)[ℋ(F(..., r, ...)) = f(..., ℋ(r), ...)], where ℋ is a function defined as:
ℋ(r) ≡ 𝒜(r) if r ∈ ℛ
       r otherwise
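The homomorphism property can be tested on sample values before any proof is attempted. The sketch below does this for one operation under loudly stated assumptions: arrays are dictionaries, queues are lists, and `enqueue_impl` is a hypothetical candidate implementation of ENQUEUE chosen for illustration, not a rule derived in the text. The check compares ℋ(ENQUEUE(r, e)) with Enqueue(ℋ(r), e) for triples r that satisfy the invariant.

```python
def invariant(v, i, j):
    """Representing domain test: i <= j and v defined on indices i..j-1."""
    return i <= j and all(k in v for k in range(i, j))


def abstraction(v, i, j):
    """Queue represented by the triple, front element first."""
    return [v[k] for k in range(i, j)]


def enqueue_impl(triple, e):
    """Hypothetical implementing function: store e at position j, advance j."""
    v, i, j = triple
    w = dict(v)
    w[j] = e
    return (w, i, j + 1)


def enqueue_abstract(queue, e):
    """The operation Enqueue of the implemented type, on list-modelled queues."""
    return queue + [e]


samples = [({}, 5, 5), ({0: "a"}, 0, 1), ({0: "a", 1: "b"}, 0, 2), ({0: "a", 1: "b"}, 1, 2)]

homomorphism_holds = all(
    abstraction(*enqueue_impl(r, "z")) == enqueue_abstract(abstraction(*r), "z")
    for r in samples
    if invariant(*r)
)
invariant_preserved = all(invariant(*enqueue_impl(r, "z")) for r in samples if invariant(*r))
```

Passing such a test is evidence only; the property itself quantifies over every value in the representing domain.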

Based on the above criterion, the derivation of a preliminary implementation was
viewed earlier as a problem of finding a set of rewrite rules PI so that PI ∪ IW and PI ∪ PW
satisfy the principle of definition. We still view the problem the same way. But, now the
implementing functions need be defined only on the values in ℛ, and the homomorphism
property need only be verified on the values in ℛ. This means that PI ∪ IW and PI ∪ PW
need satisfy the principle of definition only with respect to a subset of the set of all
generator constants of the representation type (see footnote 19). This subset is the representing
domain of constants T characterized by the association specification as follows:
T = { t | ℐ(t) ≡ True }. A proof of the claim that if PI ∪ IW and PI ∪ PW satisfy the principle
of definition with respect to T, then PI is correct can be carried out along the same lines as the
proof of the Correctness Theorem (Sec. 4.2.2). The proof for the present case can be obtained by
systematically replacing in the earlier proof the phrase "the principle of definition" by the
phrase "the principle of definition with respect to T".

19. A system S satisfies the principle of definition with respect to T if every constant of the form
F(g1, ..., gn), where F is a nongenerator function symbol and g1, ..., gn are generator constants in
T, has a unique normal form (in S) that is a generator constant in T.

5.2 Derivation of a Preliminary Implementation 

First we formulate the synthesis conditions that are used as a guide in the derivation 
of a preliminary implementation, and then describe a procedure to derive a set of rewrite 
rules PI that satisfies the synthesis conditions. The synthesis conditions are sufficient to 
ensure that PI U IW and PI U PW satisfy the principle of definition with respect to T. 

5.2.1 The Synthesis Conditions 

The synthesis conditions for a preliminary implementation PI are the following:

(1) Totality Condition:

(a) PI is well-spanned with respect to T (for every implementing function)
with every rule in it being of the form F(g1, ..., gn) → t, where F is an
implementing function symbol, and g1, ..., gn are generator expressions.

(b) PI has the uniform termination property.

(2) Uniqueness Condition: PI has the unique termination property.

(3) Homomorphism Condition: For every rule F(g1, ..., gn) → t in PI,
ℐ(g1) ∧ ... ∧ ℐ(gn) ⇒ ℋ(F(g1, ..., gn)) ≡ ℋ(t) is a theorem of PW (see footnote 20).

(4) Invariance Condition: For every rule F(g1, ..., gn) → t in PI, where the range of F
is the representation type, ℐ(g1) ∧ ... ∧ ℐ(gn) ⇒ ℐ(t) ≡ True is a theorem.

It is interesting to note the effect of the presence of the invariant ℐ on the synthesis
conditions. The Totality Condition and the Uniqueness Condition remain as before, and
serve the same purpose: The Totality Condition ensures that an implementing function is
defined and terminates on every value in the representing domain. The Uniqueness
Condition ensures that an implementing function yields a unique value on every argument.
The Homomorphism Condition, which ensures that every implementing function satisfies the
homomorphism property, now requires that ℋ(F(g1, ..., gn)) ≡ ℋ(t) be a theorem only
under the assumption that the arguments to F satisfy ℐ. The Invariance Condition imposes an
additional constraint on the expression that may appear on the right hand side of a rule: It
ensures that every implementing function preserves ℐ. The Synthesis Theorem to follow
shows that when PI satisfies all the synthesis conditions, PI ∪ IW and PI ∪ PW satisfy the
principle of definition with respect to T.

20. Here, we assume that each of the expressions g1, ..., gn is of the representation type. If not, the
antecedent would consist of a conjunction of ℐ applied to only those expressions among g1, ..., gn
that are of the representation type. The same qualification applies to condition (4), as well.

The Synthesis Theorem 

Theorem 2 Let PI be a set of rewrite rules that satisfies all the synthesis conditions. Then, 
PI U IW and PI U PW satisfy the principle of definition with respect to T, where T is the 
representing domain of constants characterized by the invariant 5. 

Proof Appendix III 

5.2.2 Deriving the rules of PI 

The derivation of PI follows the same general pattern as before. The first task is to
construct the PW, which is done as before by combining the specification of the implemented
type, the homomorphism specification, and any desired parts of the specifications of the
implementing types. The homomorphism specification is derived from the abstraction
function specification as before (sec. 4.2.2). For instance, PW for the example under
consideration is given in Fig. 16. Note that PW does not contain the invariant specification.
The information pertaining to the invariant will be maintained as a different entity. This will
be explained shortly.

The rules of PI are derived so that every synthesis condition except the Uniqueness
Condition is met. The procedure derives the preliminary implementation for one operation
at a time by deriving a separate set of rewrite rules for every operation. The method used is
the same for every operation. The procedure first determines the left hand sides of all the
rules to derive a partial preliminary implementation. Then, it determines a suitable right
hand side for each of the rules in the partial preliminary implementation.



Fig. 16. The Perturbed World.

(1) Front(Nullq) → ERROR

(2) Front(Enqueue(Nullq, e)) → e

(3) Front(Enqueue(Enqueue(q, e1), e2)) → Front(Enqueue(q, e1))

(4) Dequeue(Nullq) → ERROR

(5) Dequeue(Enqueue(Nullq, e)) → Nullq

(6) Dequeue(Enqueue(Enqueue(q, e1), e2)) → Enqueue(Dequeue(Enqueue(q, e1)), e2)

(10) Append(q, Nullq) → q

(11) Append(q1, Enqueue(q2, e2)) → Enqueue(Append(q1, q2), e2)

(12) Empty(Nullq) → True

(13) Empty(Enqueue(q, e)) → False

(14) ℋ(<v, i, i>) → Nullq

(15) ℋ(<Assign(v, e, j), i, j + 1>) → if i = j + 1 then Nullq
                                      else Enqueue(ℋ(<v, i, j>), ℋ(e))

(16) ℋ(NULLQ()) → Nullq

(17) ℋ(ENQUEUE(c, i)) → Enqueue(ℋ(c), ℋ(i))

(18) ℋ(DEQUEUE(c)) → Dequeue(ℋ(c))

(19) ℋ(APPEND(c1, c2)) → Append(ℋ(c1), ℋ(c2))

(20) ℋ(EMPTY(c)) → Empty(ℋ(c))

(21) ℋ(if_then_else(b, v1, v2)) → if_then_else(b, ℋ(v1), ℋ(v2))
5.2.2.1 Determining the Left Hand Side

The technique used for determining the left hand sides is the same as before
because the Totality Condition, which is used for the purpose, is the same as before. The left
hand sides are derived so that the set of expressions appearing as arguments to every
implementing function is well-spanned (see footnote 21). Fig. 17 gives a possible set of left
hand sides for a preliminary implementation for the example under consideration. As before,
we use the question mark identifiers as place holders for expressions yet to be determined.



Fig. 17. A Partial Preliminary Implementation

Representation

Array_Int × Integer × Integer

Definitions
NULLQ() → ?rhs1

ENQUEUE(<v, i, j>, e) → ?rhs2

FRONT(<v, i, i>) → ?rhs3

FRONT(<Assign(v, e, j), i, j + 1>) → ?rhs4

DEQUEUE(<v, i, i>) → ?rhs4
DEQUEUE(<Assign(v, e, j), i, j + 1>) → ?rhs5

APPEND(<v1, i1, j1>, <v2, i2, i2>) → ?rhs6

APPEND(<v1, i1, j1>, <Assign(v2, e, j2), i2, j2 + 1>) → ?rhs7

EMPTY(<v, i, j>) → ?rhs8



21. Note that if a set is well-spanned, then it is well-spanned with respect to any set of generator 
constants. 
5.2.2.2 Determining the Right Hand Side

The general strategy used to derive the right hand sides is the same as before. They
are derived so that the Homomorphism Condition, the Invariance Condition, and the second
part of the Totality Condition (which is left unensured while determining the left hand side)
are ensured. The right hand side of a rule is determined by deriving a synthesis equation
corresponding to the rule. A synthesis equation corresponding to a rule F(g1, ..., gn) → ?t is
an equation of the form ℋ(F(g1, ..., gn)) ≡ ℋ(?t) that satisfies the following conditions:

(1) ℐ(g1) ∧ ... ∧ ℐ(gn) ⇒ ℋ(F(g1, ..., gn)) ≡ ℋ(?t) is a theorem of PW.

(2) If the range type of F is the representation type, then
ℐ(g1) ∧ ... ∧ ℐ(gn) ⇒ ℐ(?t) ≡ True is a theorem of PW.

(3) F(g1, ..., gn) ≻ ?t, where ≻ is the termination ordering on expressions.

(4) ?t may contain only the permitted operation symbols of the implementing
types, and the implementing function symbols.

Note that the synthesis equations have additional constraints here because of ℐ. So,
the derivation of the synthesis equations has to be performed slightly
differently. This is the topic of the next section.

5.3 Deriving the Synthesis Equations 

The general strategy used for deriving a synthesis equation is the same as before.
That is, we generate a series of theorems of PW until we encounter one that qualifies to be a
synthesis equation. We use the same pair of synthesis rules for generating the theorems of
PW. The only difference lies in the set of rewrite rules used for expansion while generating
the theorems. Earlier, the rewrite rules in PW were used. But now, it is necessary to use an
additional set of rewrite rules.

There are two reasons for this. Firstly, a synthesis equation
ℋ(F(g1, ..., gn)) ≡ ℋ(?t) to be derived is a theorem of PW in a special context: a context
determined by the fact that g1, ..., gn satisfy the invariant ℐ. In deriving the synthesis
equations, one has to use rewrite rules describing this context besides the rewrite rules in PW.
Secondly, ?t has to be determined so that ℐ(?t) ≡ True is a theorem. For this, it is necessary
to use the rewrite rules in the specification of ℐ. These additional rewrite rules, which
describe information pertaining to the invariant, are maintained as a separate entity called the
Temporary World (TW). We will discuss the composition and the construction of TW
later. It is sufficient to say the following at this point: TW consists of rules that
specify ℐ, and rules that assert that g1, ..., gn satisfy the invariant. The rules in TW are used
for expansion as well as to ensure that ?t satisfies ℐ.

It should be noted that part of the Temporary World used in the derivation of a
preliminary implementation could be different for different rules in the preliminary
implementation. This is because the argument expressions appearing on the left hand side
(g1, ..., gn) are usually different for different rules. Consequently, the part of TW that
changes has to be constructed afresh at the beginning of the derivation of every rule. (The
temporary life time of a part of TW is what prompted us to name TW a Temporary World.)

5.3.1 A Simple Illustration 

In the following, we show the derivation of a synthesis equation corresponding to
the rewrite rule ENQUEUE(<v, i, j>, e) → ?rhs2 in the partial preliminary implementation
shown in Fig. 17. The derivation provides an illustration of how the generation of theorems
is influenced by TW. It also illustrates, for the first time, performing expansion using rewrite
rules that have conditional expressions in them.

The TW used for the derivation is shown below. For ease of reference, also given
below are rules excerpted from PW (Fig. 16) that are relevant in the present derivation.
Rules numbered (9) and (10) in TW are the specification of ℐ. The rule numbered (11) asserts
that the argument <v, i, j> to ENQUEUE satisfies ℐ. The fourth rule of TW, rule (12), is a
property of the invariant: Any triple <v, i, j> that satisfies ℐ is such that i ≤ j. This can be
proved as a theorem from the specification of ℐ. We will see how this is obtained in a
subsequent section where we discuss more about the Temporary World.

The Relevant Rules of PW 
(1) ℋ(<v, i, i>) → Nullq

(2) ℋ(<Assign(v, e, j), i, j + 1>) → if i = j + 1 then Nullq
                                     else Enqueue(ℋ(<v, i, j>), ℋ(e))

(3) ℋ(ENQUEUE(x, e)) → Enqueue(ℋ(x), ℋ(e))

(4) if_then_else(False, v1, v2) → v2

(5) if_then_else(True, v1, v2) → v1

(6) ℋ(if_then_else(b, v1, v2)) → if_then_else(b, ℋ(v1), ℋ(v2))

(7) x = y + 1 → not(x ≤ y)

(8) not(True) → False

The Temporary World

(9) ℐ(<v, i, i>) → True

(10) ℐ(<Assign(v, e, j), i, j + 1>) → i ≤ j + 1 ∧ [i = j + 1 ∨ ℐ(<v, i, j>)]

(11) ℐ(<v, i, j>) → True
(12) i ≤ j → True



Shown below is the generation of a series of theorems by invoking the synthesis rules,
using the rewrite rules shown above for expansion. The generation results in the derivation
of a synthesis equation of the form we desire. The first theorem in the series is obtained by
invoking Synthesis Rule (1) for the expression ℋ(ENQUEUE(<v, i, j>, e)); the normal form
of this expression is Enqueue(ℋ(<v, i, j>), ℋ(e)). The rest of the theorems in the series are
obtained by invoking Synthesis Rule (2) using different rules in PW and TW for expansion.

An explanation of our choice of the rewrite rules for expansion in the following
derivation is in order. Recall that the ultimate objective of expansion is to drive the symbol
ℋ in the right hand side of the equation in Step (1) to the outermost level of the expression.
Inspection of the rules of PW reveals two possible sets of rules which could be used for this
purpose. The first is the ℋ-rules, in particular Rule (3) of PW; however, applying this
rule in Step (1) will yield an expression identical to the one on the left hand side, which is not
acceptable. The other possibility is to apply the rules of the homomorphism specification,
i.e., either Rule (1) or Rule (2) of PW. Rule (1) is clearly not applicable. Rule (2) is also not
directly applicable. A closer look, however, reveals that Enqueue(ℋ(<v, i, j>), ℋ(e)) has the form of
the expression in the else-arm of the conditional expression on the right hand side of
Rule (2). Hence, we attempt to expand Enqueue(ℋ(<v, i, j>), ℋ(e)) to an expression
of the form if_then_else(..., ..., Enqueue(ℋ(<v, i, j>), ℋ(e))). The manipulations performed in
Steps (2) through (4) are precisely aimed at this.



Form of synthesis equation to be derived: ℋ(ENQUEUE(<v, i, j>, e))
Normal form of ℋ(ENQUEUE(<v, i, j>, e)): Enqueue(ℋ(<v, i, j>), ℋ(e))
Rules used for simplification:

Step (1) Invoke Synthesis Rule (1) on ℋ(ENQUEUE(<v, i, j>, e))

ℋ(ENQUEUE(<v, i, j>, e)) ≡ Enqueue(ℋ(<v, i, j>), ℋ(e))

Step (2) Expand Expression: Enqueue(ℋ(<v, i, j>), ℋ(e))
Using Rule: (4)

ℋ(ENQUEUE(<v, i, j>, e)) ≡ if False then v1 else Enqueue(ℋ(<v, i, j>), ℋ(e))

Step (3) Expand Expression: False
Using Rule: (8)

ℋ(ENQUEUE(<v, i, j>, e)) ≡ if not(True) then v1 else Enqueue(ℋ(<v, i, j>), ℋ(e))

Step (4) Expand Expression: True
Using Rule: (12)

ℋ(ENQUEUE(<v, i, j>, e)) ≡ if not(i ≤ j) then v1 else Enqueue(ℋ(<v, i, j>), ℋ(e))

Step (5) Expand Expression: not(i ≤ j)
Using Rule: (7)

ℋ(ENQUEUE(<v, i, j>, e)) ≡ if i = j+1 then v1 else Enqueue(ℋ(<v, i, j>), ℋ(e))

Step (6) Expand Expression: if i = j+1 then v1 else Enqueue(ℋ(<v, i, j>), ℋ(e))
Using Rule: (2)

ℋ(ENQUEUE(<v, i, j>, e)) ≡ ℋ(<Assign(v, e, j), i, j+1>)
Note that the right hand side of the last theorem in the above series is
such that

ENQUEUE(<v, i, j>, e) ≻ <Assign(v, e, j), i, j+1>

ℐ(<Assign(v, e, j), i, j+1>) →* True

Hence, we have the following preliminary implementation for ENQUEUE:
ENQUEUE(<v, i, j>, e) → <Assign(v, e, j), i, j+1>
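
The derived rule can be sanity-checked on a concrete model. The sketch below is only an illustration under assumptions that the text does not make: arrays are modelled as Python dicts, a triple <v, i, j> is read as the queue v[i], ..., v[j-1], and the names `assign`, `I`, `H`, `enqueue_impl` and `enqueue_spec` are hypothetical. It tests the synthesis equation and the preservation of the invariant on a few sample triples; it is not a substitute for the derivation.

```python
def assign(v, e, j):
    """Assign(v, e, j): a copy of array v with element e stored at index j."""
    w = dict(v)
    w[j] = e
    return w

def I(triple):
    """Invariant in the assumed model: i <= j and every slot i..j-1 is filled."""
    v, i, j = triple
    return i <= j and all(k in v for k in range(i, j))

def H(triple):
    """Abstraction function in the assumed model: the queue held in v[i..j-1]."""
    v, i, j = triple
    return [v[k] for k in range(i, j)]

def enqueue_impl(triple, e):
    """Derived rule: ENQUEUE(<v, i, j>, e) -> <Assign(v, e, j), i, j+1>."""
    v, i, j = triple
    return (assign(v, e, j), i, j + 1)

def enqueue_spec(queue, e):
    """Abstract Enqueue on the list model: add e at the rear."""
    return queue + [e]

samples = [({}, 0, 0), ({3: "a"}, 3, 4), ({1: "x", 2: "y"}, 1, 3)]
for t in samples:
    assert I(t)                                  # hypothesis: argument satisfies the invariant
    t2 = enqueue_impl(t, "new")
    assert I(t2)                                 # the invariant is preserved
    assert H(t2) == enqueue_spec(H(t), "new")    # the synthesis equation holds
```

The check on `I(t2)` mirrors the requirement that the right hand side of the derived rule satisfy the invariant under the hypothesis recorded in TW.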

Let us, for a moment, draw the attention of the reader back to steps (2) through (4) 
in the above derivation. Their aim was merely to expand Enqueue(ℋ(<v, i, j>), ℋ(e)) to a
conditional expression that had the former expression as its else-arm. The purpose of such a 
transformation was to make it possible to apply (for expanding) a rewrite rule that had a 
conditional expression on the right hand side. A situation such as this is encountered 
commonly during the generation of theorems. This is especially so when the rules of the 
input specifications have conditional expressions in them. Hence it is useful to extend the 
definition of the mechanism expand so that rewrite rules with conditional expressions on their 
right hand side can be applied directly to an expression that is not a conditional expression. 
We describe the extension below. In future illustrations of the derivation of synthesis 
equations, we will be using the extended version of expand. 

Suppose e₁ → if_then_else(b, e₂₁, e₂₂) is a rewrite rule, and α is an expression that is
being expanded by using this rule. According to the existing definition of expand, the
following protocol is used for expanding α:
Protocol 1:

(1) Check if α (or a subexpression in it) is unifiable with if_then_else(b, e₂₁, e₂₂); if so,
let θ be the most general unifier.

(2) Replace θ(α) (or the subexpression in it) by θ(e₁).

Note that according to the above protocol α is expandible only if α (or a subexpression in it)
is of the form if_then_else(...). Now, we introduce two additional ways in which the rule can
be used for expansion.
Protocol 2:

(1) Check if α (or a subexpression in it) is unifiable with e₂₁; if so, let θ be the most
general unifier.

(2) Check if θ(b) →* True, or ¬(θ(b)) →* False.

(3) If so, replace θ(α) (or the subexpression in it) by θ(e₁).

Protocol 3:

(1) Check if α (or a subexpression in it) is unifiable with e₂₂; if so, let θ be the most
general unifier.

(2) Check if θ(b) →* False, or ¬(θ(b)) →* True.

(3) If so, replace θ(α) (or the subexpression in it) by θ(e₁).

Using Protocol 3, the preliminary implementation of ENQUEUE derived earlier can be
obtained in just two steps, as shown below. The theorem in Step (1) is obtained as before. The
theorem in Step (2) is obtained by using Rule (2) of PW for expansion under
Protocol 3. Note that the boolean expression under consideration is i = j+1, and
i = j+1 →* False by Rules (7), (12) and (8).

Form of synthesis equation to be derived: ℋ(ENQUEUE(<v, i, j>, e))
Normal form of ℋ(ENQUEUE(<v, i, j>, e)): Enqueue(ℋ(<v, i, j>), ℋ(e))
Rules used for simplification:

Step (1) Invoke Synthesis Rule (1) on ℋ(ENQUEUE(<v, i, j>, e))

ℋ(ENQUEUE(<v, i, j>, e)) ≡ Enqueue(ℋ(<v, i, j>), ℋ(e))

Step (2) Expand: Occurrence: λ
Expression: Enqueue(ℋ(<v, i, j>), ℋ(e))
Using Rule: (2), Protocol 3

ℋ(ENQUEUE(<v, i, j>, e)) ≡ ℋ(<Assign(v, e, j), i, j+1>)
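
The Protocol 3 step above can be sketched in a few lines of code. This is a simplified illustration, not the thesis's procedure: terms are nested tuples, rule variables are strings beginning with "?", one-way matching stands in for unification (the expression being expanded contains only terminals), only the root of the expression is considered, and the check that the condition rewrites to False is delegated to a caller-supplied oracle. All names (`match`, `substitute`, `expand_protocol3`, `succ`, `triple`) are hypothetical.

```python
def match(pattern, term, subst):
    """One-way matching of a rule pattern against a term; returns a substitution or None."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(pattern, tuple) and isinstance(term, tuple) and len(pattern) == len(term):
        for p, t in zip(pattern, term):
            subst = match(p, t, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == term else None

def substitute(term, subst):
    """Apply a substitution to a term."""
    if isinstance(term, str) and term.startswith("?"):
        return subst.get(term, term)
    if isinstance(term, tuple):
        return tuple(substitute(t, subst) for t in term)
    return term

def expand_protocol3(term, rule, reduces_to_false):
    """Expand `term` with a rule lhs -> if_then_else(b, e21, e22) under Protocol 3."""
    lhs, (cond, _then_arm, else_arm) = rule
    theta = match(else_arm, term, {})                 # step (1): match against e22
    if theta is None:
        return None
    if not reduces_to_false(substitute(cond, theta)): # step (2): theta(b) ->* False
        return None
    return substitute(lhs, theta)                     # step (3): replace by theta(e1)

# Rule (2) of PW, with j+1 written as succ(j).
rule2 = (
    ("H", ("triple", ("Assign", "?v", "?e", "?j"), "?i", ("succ", "?j"))),
    (("eq", "?i", ("succ", "?j")),
     "Nullq",
     ("Enqueue", ("H", ("triple", "?v", "?i", "?j")), ("H", "?e"))),
)

# Stand-in for TW: the conditions assumed to rewrite to False (here i = j+1, since i <= j).
known_false = {("eq", "i", ("succ", "j"))}

term = ("Enqueue", ("H", ("triple", "v", "i", "j")), ("H", "e"))
expanded = expand_protocol3(term, rule2, lambda c: c in known_false)
```

Under these assumptions `expanded` is the term for ℋ(<Assign(v, e, j), i, j+1>), the right hand side obtained in Step (2). If the oracle cannot establish that the condition is False, the function returns None and no expansion takes place.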

It should be pointed out that the addition of Protocols 2 and 3 does not enhance the
generality of the original definition of expand. In other words, we can show the following:
Suppose β can be obtained from α in a finite number of expansion steps using a rewriting
system R under Protocols 1, 2 and 3. Then β can also be obtained from α in a finite
number of expansion steps using only Protocol 1, provided R contains the following rules
that specify if_then_else:

if_then_else(True, v₁, v₂) → v₁

if_then_else(False, v₁, v₂) → v₂

The reason for introducing protocols (2) and (3) is to reduce the number of 
expansion steps needed in the generation of theorems. The two rules of if_then_else given 
above make expansion uneconomical because the right hand side of each of them is a 
variable. This makes each of them a candidate for being used for expansion at every step of 
the theorem generation process. Use of protocols (2) and (3) in effect limits the use of the 
above two rules to cases where there is a rewrite rule with an if_then_else in its right hand 
side, and which could be used for further expansion. 

5.3.2 More on the Temporary World 

5.3.2.1 The Purpose of TW 

The Temporary World (TW) serves two purposes. Firstly, it holds information
about the invariant ℐ. Secondly, it provides a means of keeping a log of certain assertions that
are needed for temporary stretches during the course of the derivation of a preliminary
implementation. Some of these assertions are generated automatically by the procedure;
others are supplied by the user.

The information about ℐ and the assertions are entered into TW as rewrite rules.
(The derivation procedure may use the rules in TW for expansion like the rules of PW, the
Perturbed World.) The assertions needed may change during the course of the derivation of a
preliminary implementation. Some of the assertions needed can only be determined during
the course of the derivation. For these reasons, TW is treated as a dynamic world, i.e.,
a world that changes during the course of the derivation of a preliminary implementation. In
contrast, PW keeps a log of the facts needed throughout the derivation of the entire preliminary
implementation.

There are three reasons why temporary assertions might be needed during the
derivation. Firstly, the equation ℋ(F(g₁, ..., gₙ)) ≡ ℋ(?t) being searched for is a theorem of
PW only under the hypothesis that the arguments to F satisfy ℐ. The second reason arises in
checking if ?rhs satisfies ℐ, i.e., if ℐ(?rhs) ≡ True is a theorem. This check has to be
performed under the hypothesis that the arguments to F satisfy ℐ. Also, performing this
check may need the use of inductive logic. In such a case, it is necessary to set up
appropriate hypotheses for the induction.

The third reason for the need for assertions arises while one is attempting to expand
a subexpression of a conditional expression if_then_else(b, e₁, e₂). In such a situation, we
may assume that b is False while expanding a subexpression in the else-arm, or that b is True
while expanding a subexpression in the then-arm. For instance, consider the expression
if_then_else(i = j+1, e₂, Enqueue(ℋ(<v, i, j>), ℋ(e₁))). In this case, the subexpression
Enqueue(ℋ(<v, i, j>), ℋ(e₁)) is expandible by the rewrite rule
ℋ(<Assign(v, e, j), i, j+1>) → if i = j+1 then Nullq else Enqueue(ℋ(<v, i, j>), ℋ(e))
only if we make the hypothesis that i = j+1 →* False.

5.3.2.2 Construction of TW 

TW consists of two parts: A static part, and a dynamic part. The static part remains 
unchanged for the entire duration of the derivation of the preliminary implementation. The 
dynamic part may change during the derivation. 

5.3.2.2.1 The Static Part 

The static part consists of information about the invariant ℐ. It consists of

(1) A set of rewrite rules that constitute the specification of ℐ. The specification of ℐ
involves other data types which are among the implementing types. We assume that
the static part contains their specifications also. In the examples we discuss, only the
relevant rules from these specifications are displayed.

(2) A set of rewrite rules that express additional properties about ℐ.

The rewrite rules mentioned in (1), above, can be constructed automatically from
the association specification. The information in (2) is something the user has the option of
supplying additionally for deriving a preliminary implementation in the presence of a
nontrivial invariant. This information is needed for the following reason: there are several
preliminary implementations whose derivation is dependent on lemmas that express
interesting properties about the invariant. Although it might be possible to prove these
lemmas from the specification of ℐ, the derivation procedure cannot automatically discover
the desired lemma. The rewrite rules in (2) specify these lemmas.

The static part of TW used for the current example is given below. Rules (1) and (2)
are constructed from the specification of ℐ given as part of the association specification in
Fig. 15. Notice that the right hand side of rule (2) is a simplified version of the right hand
side of the corresponding equation of the specification of ℐ. The rules used in the
simplification are (10), (11), (8), and (4). Rule (3) specifies a property of ℐ. It asserts that if a
triple <v, i, j> satisfies ℐ, then i ≤ j. The property can be proved from the specification of ℐ
using the KB-method. Rules (4) through (11) belong to the specifications of Integer and Bool.
These rules will be used in the examples that follow.

(1) ℐ(<v, i, i>) → True

(2) ℐ(<Assign(v, e, j), i, j+1>) → i ≤ j+1 ∧ [i = j+1 ∨ ℐ(<v, i, j>)]

(3) [ℐ(<v, i, j>) ⇒ i ≤ j] → True

(4) x = y ∨ x ≤ y → x ≤ y

(5) True ∨ x → True

(6) ¬x ∨ x → True

(7) ¬(x ∧ y) → ¬x ∨ ¬y

(8) x ∨ (y ∧ z) → (x ∨ y) ∧ (x ∨ z)

(9) (x ∧ y) ⇒ y → True

(10) if_then_else(b, True, e₁) → b ∨ e₁

(11) if_then_else(b, e₁, False) → b ∧ e₁
5.3.2.2.2 The Dynamic Part

This is the part that may change during the course of the derivation of a preliminary
implementation. It may vary from the derivation of one rule of the preliminary
implementation to another; within the derivation of a single rule, it may vary from one
theorem generation step to the next. By a theorem generation step, we mean the following:
Recall that the derivation of a rule involves generating a series of theorems. The generation
of every theorem in the series is considered as a theorem generation step in the derivation of
the rule.

The dynamic part is empty at the beginning of the derivation of every rule of the
implementation definition. Assertions (in the form of rewrite rules) are added to and
removed from the dynamic part at specific instants during the derivation of a rule. Every
assertion that is added during the derivation of a rule is removed by the end of the derivation.
Every time an assertion is added to TW, it is important to ascertain that the addition does not
render TW inconsistent. To ensure consistency, we run the predicate
Is-an-inductive-theorem-of (see sec. 4.4.3.2 and footnote 22) on TW every time an assertion is
added to TW. (Note that TW is convergent to begin with. This is because the static part, which
consists of the specification of ℐ, is guaranteed to be convergent.) The assertion is added only if
Is-an-inductive-theorem-of succeeds. In some cases Is-an-inductive-theorem-of may
succeed by generating a finite number of new assertions. In several situations it is useful to
add these new assertions also to TW. If these assertions are, indeed, added to TW, then they
should also be removed along with the original assertion.

The assertions in the dynamic part can be classified into two categories based on the 
life time of their existence. We describe the construction of the two categories below. 

Arguments-Assertions 

These assertions are added at the beginning of the derivation of a rule. They remain
in TW until the end of the derivation of the rule. We call these assertions
Arguments-Assertions because they are dependent on the expressions supplied as arguments
to the implementing function for which the rule is being derived. For instance, if the rule
being derived is of the form F(g₁, ..., gₙ) → ?t, then the assertions are dependent on
g₁, ..., gₙ.

22. We assume that the predicate Is-an-inductive-theorem-of is run iteratively a fixed, finite
number of times.

Arguments-Assertions can be of two kinds. The first kind assert that g₁, ..., gₙ
satisfy ℐ. These are entered in TW as the rewrite rules ℐ(g₁) → True, ..., ℐ(gₙ) → True. It is
easy to see that these assertions can be constructed automatically.

The second kind consists of assertions that are supplied by the user. These are used
for ensuring that every rule of the preliminary implementation preserves the invariant ℐ, i.e.,
ℐ(g₁) ∧ ... ∧ ℐ(gₙ) ⇒ ℐ(F(g₁, ..., gₙ)). The assertions express the induction hypotheses that
might be needed for checking the above property. The reason that the user might have to
supply these assertions is the following. Recall that our method ensures the invariance
property by deriving every rule F(g₁, ..., gₙ) → ?t so that ℐ(?rhs) ≡ True is a theorem of TW.
(Note that TW already includes rewrite rules asserting that g₁, ..., gₙ satisfy ℐ.) If the
preliminary implementation desired is such that ℐ(?t) ≡ True can be proved automatically
from TW using the equational logic or the KB-method for proving inductive properties, then
no additional assertions are needed. However, if the preliminary implementation desired is
such that the proof of ℐ(?rhs) ≡ True needs induction hypotheses that cannot be generated
automatically by the KB-method, then assertions expressing the induction hypotheses have to
be added to TW.

The assertions used as induction hypotheses in all our examples are constructed by
invoking the inference rule given below. The inference rule expresses a general induction
principle that uses the termination ordering ≻ as the well-founded partial ordering for the
induction. Informally, the inference rule can be stated as follows. Suppose F(g₁, ..., gₙ) →
?t is the rule being derived. Then, in trying to ensure ℐ(F(g₁, ..., gₙ)), we may assume
ℐ(F(v₁, ..., vₖ)) for any argument <v₁, ..., vₖ> that satisfies ℐ, and that is "less than"
<g₁, ..., gₙ> in the ordering ≻.

        <g₁, ..., gₙ> ≻ <v₁, ..., vₖ>
  -------------------------------------------
  ℐ(v₁) ∧ ... ∧ ℐ(vₖ) ⇒ ℐ(F(v₁, ..., vₖ))
As an illustration, let us construct a set of Arguments-Assertions for the derivation of a rule
for APPEND. We will be using these assertions later when we illustrate the derivation of the
preliminary implementation for APPEND. Suppose we are attempting to derive a rule of the
following form:

APPEND(<v₁, i₁, j₁>, <Assign(v₂, e, j₂), i₂, j₂+1>) → ?rhs

Then, the Arguments-Assertions may include the following rewrite rules. The first two
assertions state that the arguments supplied to APPEND satisfy ℐ. The third assertion is used
as an induction hypothesis.

ℐ(<v₁, i₁, j₁>) → True

ℐ(<Assign(v₂, e, j₂), i₂, j₂+1>) → True

ℐ(<v₂, i₂, j₂>) ⇒ ℐ(APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>)) → True

Conditional-Expressions-Assertions

The second category of assertions in the dynamic part is the
Conditional-Expressions-Assertions. A need for these assertions arises while expanding a
subexpression of a conditional expression in the generation of theorems. These assertions are
added to TW at the beginning of a theorem generation step, and removed at the end of the
step. The Conditional-Expressions-Assertions needed in a step are determined by the
occurrence of the subexpression that is chosen to be expanded for generating the theorem in
that step. For instance, suppose the following is the theorem generated in the first step
during the derivation of a rule for APPEND.

ℋ(APPEND(<v₁, i₁, j₁>, <Assign(v₂, e, j₂), i₂, j₂+1>))
    ≡ if_then_else(i₂ = j₂+1, ℋ(<v₁, i₁, j₁>),
                   Enqueue(ℋ(APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>)), ℋ(e)))

Suppose we decide to generate the theorem in Step (2) by expanding the subexpression
ℋ(APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>)) on the right hand side of the theorem in Step (1). Then,
we may add to TW the assertion i₂ = j₂+1 → False. The reasoning behind the addition of
this assertion should be apparent by now. The subexpression chosen for expansion appears
in the else-arm of a conditional expression. Hence, while expanding the subexpression we
may (if we wish) assume that the corresponding boolean expression is False. In general, we
may have to add more than one such assertion in a step because the subexpression could be
embedded within more than one conditional expression. Suppose α is the subexpression
chosen to be expanded. Then, the Conditional-Expressions-Assertions for the step are
determined as follows:

(i) For every conditional expression if_then_else(b₁, ...α..., ...) of whose then-arm α is a
part, add b₁ → True.

(ii) For every conditional expression if_then_else(b₂, ..., ...α...) of whose else-arm α is a
part, add b₂ → False.
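
Rules (i) and (ii) amount to a walk from the root of the expression down to the chosen occurrence. The following sketch illustrates this under assumptions of our own: terms are nested tuples of the form (symbol, arg1, arg2, ...), an occurrence is a list of 1-based argument positions, and the function name `conditional_assertions` is hypothetical.

```python
def conditional_assertions(term, occurrence):
    """Return (condition, truth value) pairs for every if_then_else enclosing the occurrence."""
    assertions = []
    for pos in occurrence:
        if isinstance(term, tuple) and term[0] == "if_then_else":
            if pos == 2:                       # path enters the then-arm: add b -> True
                assertions.append((term[1], True))
            elif pos == 3:                     # path enters the else-arm: add b -> False
                assertions.append((term[1], False))
        term = term[pos]                       # descend to the selected argument
    return assertions

# Right hand side of the theorem in Step (1), with j2+1 written as succ(j2).
rhs = ("if_then_else",
       ("eq", "i2", ("succ", "j2")),
       ("H", "q1"),
       ("Enqueue", ("H", ("APPEND", "q1", "q2")), ("H", "e")))

# Occurrence of H(APPEND(q1, q2)): third argument of if_then_else, then first argument of Enqueue.
step_assertions = conditional_assertions(rhs, [3, 1])
```

For the APPEND example this yields the single assertion that i₂ = j₂+1 is False; a subexpression nested inside several conditionals yields one assertion per enclosing conditional, in outermost-first order.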

5.3.3 Preliminary Implementation of Append 

This section derives a pair of synthesis equations corresponding to the two rewrite
rules in the partial preliminary implementation that define APPEND. It illustrates a more
interesting utilization of the invariant ℐ than was seen in the derivation of the rule for
ENQUEUE. The derivation also demonstrates how a where construct can be introduced into
a preliminary implementation, and why it is useful to do so.

Recall the reason for introducing the where construct into the preliminary
implementation language: to alleviate the limitation imposed by the constraint that a preliminary
implementation may not contain any helping functions or observers of the representation
type. The constraint, in particular, makes it impossible to select the components of a tuple
returned by an expression that appears on the right hand side of a rule.

For instance, suppose we wish to construct a triple using the components of the
triple returned by APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>). A where construct permits us to do this
by rewriting the above expression in the following fashion.

<v, i, j> where <v, i, j> is APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>)

Then, the first argument can be further transformed to construct the desired triple. For
instance,

<Assign(v, e₂, j), i, j+1> where <v, i, j> is APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>)

The new terminals v, i, j introduced should be distinct from the terminals that
already exist in the expression that is being transformed. It should be noted that a where
construct can always be eliminated from an expression provided we are permitted to use the
selector operations of the tuple type. This elimination can also be performed automatically.
For instance, the where construct in the above expressions can be eliminated by systematically
replacing every occurrence of v, i, and j in the first argument to the where construct by the
following expressions: First(APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>)), Second(APPEND(<v₁, i₁, j₁>,
<v₂, i₂, j₂>)), Third(APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>)). (First, Second, and Third are
operations that select the first, second, and third components of a triple.)
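
The elimination just described is a plain substitution. A minimal sketch follows, assuming terms are nested tuples (symbol, arg1, ...) with terminals as strings; the function names `substitute` and `eliminate_where` and the symbols `triple` and `succ` are hypothetical.

```python
def substitute(term, env):
    """Replace every terminal that appears in env; function symbols are left untouched."""
    if isinstance(term, str):
        return env.get(term, term)
    head, args = term[0], term[1:]
    return (head,) + tuple(substitute(a, env) for a in args)

def eliminate_where(body, names, bound):
    """Rewrite `body where <names> is bound` using the selectors First, Second, Third."""
    selectors = ("First", "Second", "Third")
    env = {name: (sel, bound) for name, sel in zip(names, selectors)}
    return substitute(body, env)

# <Assign(v, e2, j), i, j+1> where <v, i, j> is APPEND(q1, q2)
append_call = ("APPEND", "q1", "q2")
body = ("triple", ("Assign", "v", "e2", "j"), "i", ("succ", "j"))
without_where = eliminate_where(body, ("v", "i", "j"), append_call)
```

Each occurrence of v, i and j in the first argument is replaced by the corresponding selector applied to the APPEND expression, so the APPEND expression is duplicated once per occurrence. The where construct avoids that duplication.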

Below, we give two rules concerning a where construct. The rules can be used at any
step during the generation of theorems to transform the expression on the right hand side of a
theorem. The first rule specifies how a where construct can be introduced into an expression.
The second rule specifies how the position of where can be moved within an expression
without altering its semantics. Suppose

(1) F is an implementing function whose range is a triple type,

(2) g is an arbitrary function,

(3) e, e₁, ..., eₖ are arbitrary expressions,

(4) v, i, j are terminals that do not appear in the equation e ≡ g(..., F(e₁, ..., eₖ), ...).

Where-Rule (1)

  e ≡ g(..., F(e₁, ..., eₖ), ...)
  ----------------------------------------------------------------
  e ≡ g(..., <v, i, j> where <v, i, j> is F(e₁, ..., eₖ), ...)

Where-Rule (2)

  e ≡ g(..., <v, i, j> where <v, i, j> is F(e₁, ..., eₖ), ...)
  ----------------------------------------------------------------
  e ≡ g(..., <v, i, j>, ...) where <v, i, j> is F(e₁, ..., eₖ)



A few remarks are in order at this point regarding expanding an expression that
appears as a subexpression of a where construct. Firstly, an instance of a where construct is
treated, for syntactic purposes, as an application of a function Where_is with three arguments.
For instance, <Assign(v, e₂, j), i, j+1> where <v, i, j> is APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>) is
treated as the expression Where_is(<Assign(v, e₂, j), i, j+1>, <v, i, j>, APPEND(<v₁, i₁, j₁>,
<v₂, i₂, j₂>)). Secondly, the second argument to Where_is may not be expanded; only the
first and the third may be expanded. In the above example, for instance, <v, i, j> may not be
expanded. This is because the second argument to Where_is has to be a tuple of terminals (or
variables). It does not make sense to have a nonterminal expression as a part of the second
argument; expansion will introduce a nonterminal expression.

The third remark concerns the possibility of making temporary assertions while
expanding subexpressions of a where expression. Consider the example given above.
Suppose we decide on expanding the expression <Assign(v, e₂, j), i, j+1>. The terminals v, i,
and j in this expression are such that <v, i, j> is acting as a place holder for APPEND(<v₁, i₁,
j₁>, <v₂, i₂, j₂>). If APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>) happens to be such that ℐ(APPEND(<v₁,
i₁, j₁>, <v₂, i₂, j₂>)) is True, then we may assume that ℐ(<v, i, j>) is also True as long as we are
expanding the first argument to the where expression. This assumption may, in general,
enhance the possibility for expansion. Thus, expansion of a subexpression of a where
expression may result in an update of the Temporary World (TW). For instance, in the above
example, if ℐ(APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>)) ≡ True is a theorem of TW, then we may
update TW with the assertion ℐ(<v, i, j>) → True. This is used in the derivation to follow.

ArgsSet = { Arg1: <<v₁, i₁, j₁>, <v₂, i₂, i₂>>,
            Arg2: <<v₁, i₁, j₁>, <Assign(v₂, e₂, j₂), i₂, j₂+1>> }

Relevant Excerpts of the Perturbed World

(1) ℋ(<v, i, i>) → Nullq

(2) ℋ(<Assign(v, e, j), i, j+1>) → if_then_else(i = j+1, Nullq, Enqueue(ℋ(<v, i, j>), ℋ(e)))

(3) ℋ(APPEND(x, y)) → Append(ℋ(x), ℋ(y))

(4) ℋ(if_then_else(b, v₁, v₂)) → if_then_else(b, ℋ(v₁), ℋ(v₂))

Derivation of the rule corresponding to Arg1

Form of the theorem to be generated: ℋ(APPEND(<v₁, i₁, j₁>, <v₂, i₂, i₂>)) ≡ ℋ(?rhs₁)
Normal form of ℋ(APPEND(<v₁, i₁, j₁>, <v₂, i₂, i₂>)): ℋ(<v₁, i₁, j₁>)
Rules used for simplification:

Initial State of the Temporary World

ℐ(<v, i, i>) → True

ℐ(<Assign(v, e, j), i, j+1>) → i ≤ j+1 ∧ [i = j+1 ∨ ℐ(<v, i, j>)]

[ℐ(<v, i, j>) ⇒ i ≤ j] → True

ℐ(<v₁, i₁, j₁>) → True

Step (1) Invoke Synthesis Rule (1) on ℋ(APPEND(<v₁, i₁, j₁>, <v₂, i₂, i₂>))

ℋ(APPEND(<v₁, i₁, j₁>, <v₂, i₂, i₂>)) ≡ ℋ(<v₁, i₁, j₁>)

APPEND(<v₁, i₁, j₁>, <v₂, i₂, i₂>) → <v₁, i₁, j₁>



Derivation of the rule corresponding to Arg2

Form of the theorem to be generated: ℋ(APPEND(<v₁, i₁, j₁>, <Assign(v₂, e₂, j₂), i₂, j₂+1>)) ≡ ℋ(?rhs₂)

Normal form of ℋ(APPEND(<v₁, i₁, j₁>, <Assign(v₂, e₂, j₂), i₂, j₂+1>)):

    if_then_else(i₂ = j₂+1, ℋ(<v₁, i₁, j₁>),
                 Enqueue(Append(ℋ(<v₁, i₁, j₁>), ℋ(<v₂, i₂, j₂>)), ℋ(e₂)))

Rules used for simplification:

Initial State of the Temporary World

Static Part

(5) ℐ(<v, i, i>) → True

(6) ℐ(<Assign(v, e, j), i, j+1>) → i ≤ j+1 ∧ [i = j+1 ∨ ℐ(<v, i, j>)]

(7) [ℐ(<v, i, j>) ⇒ i ≤ j] → True

Arguments-Assertions

(8) ℐ(<v₁, i₁, j₁>) → True

(9) ℐ(<Assign(v₂, e₂, j₂), i₂, j₂+1>) → True

(The following is a consequence of Rule (9).)
(9a) i₂ ≤ j₂+1 ∧ [i₂ = j₂+1 ∨ ℐ(<v₂, i₂, j₂>)] → True

(10) ℐ(<v₂, i₂, j₂>) ⇒ ℐ(APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>)) → True
Step (1) Invoke Synthesis Rule (1) on the expression ℋ(APPEND(<v₁, i₁, j₁>, <Assign(v₂, e₂, j₂), i₂, j₂+1>))

ℋ(APPEND(<v₁, i₁, j₁>, <Assign(v₂, e₂, j₂), i₂, j₂+1>)) ≡
    if_then_else(i₂ = j₂+1, ℋ(<v₁, i₁, j₁>),
                 Enqueue(Append(ℋ(<v₁, i₁, j₁>), ℋ(<v₂, i₂, j₂>)), ℋ(e₂)))

Step (2) Expand: Occurrence: 3.1
Expression: Append(ℋ(<v₁, i₁, j₁>), ℋ(<v₂, i₂, j₂>))
Using Rule: (3)

ℋ(APPEND(<v₁, i₁, j₁>, <Assign(v₂, e₂, j₂), i₂, j₂+1>)) ≡
    if_then_else(i₂ = j₂+1, ℋ(<v₁, i₁, j₁>),
                 Enqueue(ℋ(APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>)), ℋ(e₂)))

Step (3) Transform: Occurrence: 3.1.1
Expression: APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>)
Using Rule: where-rule (1)

ℋ(APPEND(<v₁, i₁, j₁>, <Assign(v₂, e₂, j₂), i₂, j₂+1>)) ≡
    if_then_else(i₂ = j₂+1, ℋ(<v₁, i₁, j₁>), Enqueue(ℋ(<v, i, j>), ℋ(e₂))
                 where <v, i, j> is APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>))

Step (4) Expand: Occurrence: 3
Expression: Enqueue(ℋ(<v, i, j>), ℋ(e₂))
Using Rule: (2), Protocol 3

TW Update:

Added because expression is in scope of else-arm
    i₂ = j₂+1 → False
    i₂ ≤ j₂+1 ∧ ℐ(<v₂, i₂, j₂>) → True
    ℐ(<v₂, i₂, j₂>) → True
    ℐ(APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>)) → True
Added because expression is in scope of where
    ℐ(<v, i, j>) → True
    i ≤ j → True

ℋ(APPEND(<v₁, i₁, j₁>, <Assign(v₂, e₂, j₂), i₂, j₂+1>)) ≡
    if_then_else(i₂ = j₂+1, ℋ(<v₁, i₁, j₁>), ℋ(<Assign(v, e₂, j), i, j+1>)
                 where <v, i, j> is APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>))

Step (5) Transform: Occurrence:
Using Rule: where-rule (2)

ℋ(APPEND(<v₁, i₁, j₁>, <Assign(v₂, e₂, j₂), i₂, j₂+1>)) ≡
    if_then_else(i₂ = j₂+1, ℋ(<v₁, i₁, j₁>), ℋ(<Assign(v, e₂, j), i, j+1>))
    where <v, i, j> is APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>)

Step (6) Expand: Occurrence: λ
Using Rule: (4)

ℋ(APPEND(<v₁, i₁, j₁>, <Assign(v₂, e₂, j₂), i₂, j₂+1>)) ≡
    ℋ(if_then_else(i₂ = j₂+1, <v₁, i₁, j₁>, <Assign(v, e₂, j), i, j+1>))
    where <v, i, j> is APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>)

APPEND(<v₁, i₁, j₁>, <Assign(v₂, e₂, j₂), i₂, j₂+1>) →
    if i₂ = j₂+1 then <v₁, i₁, j₁>
    else <Assign(v, e₂, j), i, j+1> where <v, i, j> is APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>)

Definition of APPEND 

APPEND(<v₁, i₁, j₁>, <v₂, i₂, i₂>) → <v₁, i₁, j₁>
APPEND(<v₁, i₁, j₁>, <Assign(v₂, e₂, j₂), i₂, j₂+1>) →
    if i₂ = j₂+1 then <v₁, i₁, j₁>
    else <Assign(v, e₂, j), i, j+1> where <v, i, j> is APPEND(<v₁, i₁, j₁>, <v₂, i₂, j₂>)
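
The definition of APPEND can be exercised on a concrete model in the same spirit as the rule for ENQUEUE. The sketch below rests on assumptions of our own rather than on anything stated in the text: arrays are Python dicts, a triple <v, i, j> stands for the queue v[i], ..., v[j-1], and the names `assign`, `H` and `append_impl` are hypothetical. In this model the pattern <v₂, i₂, i₂> of the first rule and the test i₂ = j₂+1 of the second rule both reduce to the same comparison, so a single check covers them.

```python
def assign(v, e, j):
    """Assign(v, e, j): a copy of array v with element e stored at index j."""
    w = dict(v)
    w[j] = e
    return w

def H(triple):
    """Abstraction function in the assumed model: the queue held in v[i..j-1]."""
    v, i, j = triple
    return [v[k] for k in range(i, j)]

def append_impl(q1, q2):
    """APPEND on triples, following the two derived rules.

    Both arguments are assumed to satisfy the invariant (in particular i2 <= j2+1);
    otherwise the recursion would not terminate.
    """
    v2, i2, j2_plus_1 = q2
    if i2 == j2_plus_1:                      # second queue is empty: result is q1
        return q1
    j2 = j2_plus_1 - 1                       # q2 has the form <Assign(v2, e2, j2), i2, j2+1>
    e2 = v2[j2]
    v, i, j = append_impl(q1, (v2, i2, j2))  # "where <v, i, j> is APPEND(q1, <v2, i2, j2>)"
    return (assign(v, e2, j), i, j + 1)

q1 = ({0: "a", 1: "b"}, 0, 2)
q2 = ({5: "c", 6: "d"}, 5, 7)
result = append_impl(q1, q2)
```

With these inputs the abstract value of `result` is the concatenation of the abstract values of `q1` and `q2`, which is what the synthesis equation ℋ(APPEND(x, y)) ≡ Append(ℋ(x), ℋ(y)) requires of the list model.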
6. Stage 2: The Target Implementation

The second stage of the synthesis procedure transforms the preliminary
implementation of the implemented type into a target implementation. For instance, in the
example implementing Queue_Int in terms of Circ_List, the preliminary implementation
derived in the last chapter (shown in Fig. 5 of chapter 2) is transformed into a target
implementation such as the one shown in Fig. 0.

There are two differences between a preliminary implementation and a target 
implementation. The first one is that in a preliminary implementation the only operations of 
the representation type allowed to appear are the generators of the type. The target 
implementation may also contain nongenerators of the type. The second difference is in the 
function definition methods used by the two forms of implementation. In a preliminary 
implementation a function is defined by means of a set of rewrite rules. For example, the
preliminary implementation of ENQUEUE (Fig. 5) is:

ENQUEUE(Create, j) → Insert(Create, j)
ENQUEUE(Insert(c, i), j) → Insert(ENQUEUE(c, j), i)

In a target implementation a function is defined by means of a single expression. For
example, ENQUEUE is defined as: ENQUEUE(d, k) ::= Rotate(Insert(d, k)). The
transformation performed takes into consideration both of these differences.

Fig. 18. An Implementation

NULLQ() ::= Create()
ENQUEUE(c, j) ::= Rotate(Insert(c, j))
FRONT(c) ::= Value(c)
DEQUEUE(c) ::= Remove(c)
APPEND(c, d) ::= Join(d, c)
SIZE(c) ::= if Empty(c) then 0
            else SIZE(Remove(c)) + 1

It should be noted that a preliminary implementation is an executable
implementation. It can be executed by an interpreter that simplifies algebraic expressions
using the rewrite rules in the preliminary implementation and the specifications of the
implementing types. The interpreter must have a pattern matching capability to invoke the
appropriate rewrite rule while simplifying an expression. The program verification system
AFFIRM [39] and the programming system PROLOG [??] provide such an interpreter.
Given the specifications of all the implementing types, the interpreter can execute the
preliminary implementation on any given input. For example, the value returned by the
operation Front (of Queue_Int) on the queue constructed by Enqueue(Nullq, 1) is obtained
by finding the normal form of FRONT(ENQUEUE(NULLQ(), 1)) using the preliminary
implementation. The normal form is 1. Depending on the range type of the operation, the
normal form can, in general, be a generator constant of any of the implementing types. The
normal form can then be evaluated assuming there exist implementations for the
implementing types.

Our goal is to derive the target implementation in a form that can be compiled by a 
compiler for an applicative language. The motivation for this is primarily one of efficiency. 
There are two reasons why a target implementation is more efficient than a preliminary 
implementation. The first one arises because of the freedom to use nongenerators of the 
representation type in a target implementation. This enables one, in some instances, to 
eliminate recursion from the preliminary implementation of an operation, and to transform it 
into a target implementation which is merely a composition of the operations of the 
implementing types. The implementation of ENQUEUE shown above is an instance of such 
a situation. The use of the operation Rotate in the target implementation eliminates the 
recursion which was essential in the preliminary implementation. The second reason is that 
an implementation that can be compiled by means of a conventional compiler is in general 
more efficient than interpreting a set of rewrite rules. 

We develop two methods of deriving a target implementation from a preliminary 
implementation: The Recursion Preserving Method, and the Recursion Eliminating Method. 
Both the methods are based upon expansion using rewrite rules. The target implementations 
derived by the first method preserve any recursion that may exist in the corresponding 
preliminary implementations. The second method can eliminate recursion from a 
preliminary implementation of an operation if there exists a nonrecursive implementation for 
the operation. The second method is more general because it can also derive the 
implementations derived by the first method. The advantage of the first method is that it is, 
in general, faster than the second in situations where the two methods derive the same target 
implementation. 

6.1 The Recursion Preserving Method 

This method uses a special set of functions, called the inverting functions, on the 
implementing types for transforming a preliminary implementation into a target 
implementation. To understand what inverting functions are and how they are useful in 
deriving a target implementation, let us take a closer look at the difference in the function 
definition methods used by the two forms of implementation. The preliminary 
implementation for SIZE is 

SIZE(Create) -> 0 
SIZE(Insert(c, i)) -> SIZE(c) + 1, 

and a possible target implementation for it is 

SIZE(d) ::= if Empty(d) then 0 
            else SIZE(Remove(d)) + 1. 

In the preliminary implementation, the argument to SIZE on the left hand side of a 
rule may be a generator expression. The argument indicates the structure of the expression 
that constructs the values for which the rewrite rule is applicable. This freedom serves two 
purposes in a preliminary implementation. Firstly, it is used for performing a case analysis 
based on the structure of the argument. Secondly, the explicit indication of the structure of 
the arguments on the left hand side makes the decomposition of the arguments trivial. For 
instance, in the second rewrite rule for SIZE the variable c used on the right hand side is 
actually a component of the argument to SIZE. We were able to access this component 
without actually having to generate code to decompose the argument. 

In a target implementation, the argument to SIZE on the left hand side of the 
definition is a variable. This means that the expression on the right hand side of the 
definition must have explicit pieces of "code" to perform the case analysis based on the 
structure of the argument, and to decompose the argument. For instance, in the target 
implementation of SIZE given above, the subexpression Remove(d) extracts the component 
of the argument d that is denoted by the variable c in the preliminary implementation. The 
subexpression Empty(d) checks if d is a value constructed by Create; the if_then_else 
expression performs the desired case analysis. Let us call the subexpressions that perform 
these functions inverting expressions. 

A preliminary implementation can be systematically transformed into a target 
implementation if the inverting expressions can be generated automatically. The inverting 
functions of the implementing types serve precisely this purpose. For instance, in the above 
example Remove and Empty are two of the inverting functions for Circ_List. The inverting 
expressions can be automatically derived in terms of the inverting functions. Thus, the 
transformation of a preliminary implementation into a target implementation according to 
this method consists of two steps: First, determine the inverting expressions in terms of the 
inverting functions; second, derive implementations for the inverting functions in terms of 
the operations of the implementing types. The two subsections to follow describe the two 
steps. 

6.1.1 Inverting Functions and Inverting Expressions 

Inverting functions 23 of a data type are a family of functions on the type that are 
inter-related in a special way. Inverting functions are defined with respect to a basis of the 
type. The relationship among the inverting functions of a family is such that the functions 
can be used to algorithmically invert the process of constructing a value from the generators 
of the type. In other words, it is possible to construct algorithmically the inverting 



23. Inverting functions are related to distinguished functions defined in [24]. A family of inverting 
functions for a data type can also serve as a family of distinguished functions. The reverse implication 
is not true in general. In [24] distinguished functions are used to formalize the expressive power of a 
data type. 

expressions as a composition of appropriate inverting functions. The inverting expressions 
perform the following functions: 

(1) Given a variable v and a generator expression t, check if the value denoted by v can 
be constructed by a generator expression that has the form of t. Since an inverting 
expression that performs this function is normally a boolean expression, we call it a 
boolean inverting expression. 

(2) Assuming that a given variable v denotes a value that is constructed by an expression 
that has the form of a given generator expression t, determine the various 
components of t from v. We call an inverting expression that performs this function 
a component inverting expression since it extracts a component of a generator 
expression. 

For example, the operations Remove, Value, and ¬(Empty) can serve as a family of 
inverting functions for Circ_List. This is because the inverting expressions for any generator 
expression of Circ_List can be automatically constructed from these operations. For instance, 
suppose v is a variable of type Circ_List, and t = Insert(Insert(c, i), j) is the generator 
expression under consideration. The following are some of the inverting expressions for t: 

(1) Not(Empty(Remove(v))) is a boolean inverting expression for t. It checks if v 
denotes a value constructed by a generator expression that has the form of t. 

(2) Some of the component inverting expressions of t are Value(v) which extracts j, 
Remove(Remove(v)) which extracts c, and Value(Remove(v)) which extracts i. 
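
The inverting expressions for t = Insert(Insert(c, i), j) can be tried out on a concrete model of Circ_List. The sketch below is an assumption made for illustration: it represents each value directly by its generator expression (a nested tuple), so that Remove, Value and Empty simply take that expression apart. The helper names has_form_of_t and components_of_t are hypothetical.

```python
# A toy model of Circ_List in which a value IS its generator expression.
CREATE = ("Create",)

def Insert(c, i):
    return ("Insert", c, i)

def Empty(v):
    return v == CREATE

def Remove(v):
    if Empty(v):
        raise ValueError("Remove(Create) is ERROR")
    return v[1]

def Value(v):
    if Empty(v):
        raise ValueError("Value(Create) is ERROR")
    return v[2]

def Not(b):
    return not b

def has_form_of_t(v):
    """Boolean inverting expression for t = Insert(Insert(c, i), j)."""
    return Not(Empty(Remove(v)))

def components_of_t(v):
    """Component inverting expressions for t, returned as (c, i, j)."""
    return Remove(Remove(v)), Value(Remove(v)), Value(v)

c, i, j = Insert(CREATE, 7), 8, 9
v = Insert(Insert(c, i), j)
print(has_form_of_t(v))                      # True
print(components_of_t(v) == (c, i, j))       # True
print(has_form_of_t(Insert(CREATE, 5)))      # False: only one Insert
```

As in the text, the boolean inverting expression is applied to values built with at least one Insert; applying it to Create raises the error that models Remove(Create) -> ERROR.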

Let us now formalize the properties that characterize a family of inverting functions 
for an arbitrary data type. We express the properties in the form of rewrite rules. The 
properties are such that they do not necessarily characterize a unique set of functions. This is 
done deliberately to offer flexibility in choosing an implementation for the inverting 
functions. Inverting functions are always defined with respect to a basis for the data type. 
Let the basis for the data type be $B = \{\sigma_i \mid i > 0\}$. Inverting functions can be classified into 
two categories: the component inverting functions and the boolean inverting functions. 
(1) There is a set of n component inverting functions $(d_1, \ldots, d_n)$ associated with every 
generator $\sigma_i$ in the basis whose arity is n. They are characterized by the following 
property: 

$\sigma_i(d_1(\sigma_i(v_1, \ldots, v_n)), \ldots, d_n(\sigma_i(v_1, \ldots, v_n))) \to \sigma_i(v_1, \ldots, v_n)$ 

A generator whose arity is zero does not have any associated component inverting 
functions. The component inverting functions associated with $\sigma_i$ factor a value 
constructed by $\sigma_i$. They return the arguments used by $\sigma_i$ in constructing the value. 
At the outset it may appear more natural to characterize the component inverting 
functions as follows: $d_j(\sigma_i(v_1, \ldots, v_n)) \to v_j$. The problem with such a 
characterization is that it may result in ill-defined component inverting functions in 
situations where the generators can be used in more than one way to construct the 
same value. For instance, consider the basis $B = \{0, 1, +\}$ for Natural_Numbers. 
If $d_1$ associated with + is defined as $d_1(x + y) \to x$, then we have a situation where 
$d_1(0 + 1) = 0$ and $d_1(1 + 0) = 1$. This will conflict with the rest of the specification 
of type Natural_Numbers, which should allow us to prove that (0 + 1) = (1 + 0). 

(2) There is a boolean inverting function associated with every generator in the basis. 
The boolean inverting function, $p_i$, associated with a generator $\sigma_i$ returns True on 
values that can be constructed by a generator expression that has the form 
$\sigma_i(v_1, \ldots, v_n)$. So, $p_i$ is characterized by $p_i(v) \to \sigma_i(d_1(v), \ldots, d_n(v)) = v$, where = 
is the equality operation on the type. Thus, the recursion preserving method in 
general applies only when each of the implementing types has the equal operation 
defined on it. A simpler characterization, which applies only when the basis is such 
that every value of the type can be constructed uniquely using the generators, is as 
follows: 

$p_i(\sigma_i(v_1, \ldots, v_n)) \to \mathrm{True}$ 
$p_i(\sigma_j(v_1, \ldots, v_n)) \to \mathrm{False} \quad (i \neq j)$ 

The basis for Circ_List is B = {Create, Insert}. It has two component inverting 
functions ($d_1$ and $d_2$), both of which are associated with Insert, and characterized by 
Insert(d_1(Insert(v, i)), d_2(Insert(v, i))) -> Insert(v, i). It has two boolean inverting functions, $p_1$ 
and $p_2$, one associated with Create and the other associated with Insert. They are 
characterized as follows. (Note that the generators of Circ_List are such that every circular list 
can be constructed uniquely in terms of the generators.) 

p_1(Create) -> True 
p_1(Insert(c, i)) -> False 

p_2(Insert(c, i)) -> True 
p_2(Create) -> False 

Notice that $p_1$ and $p_2$, in this case, are complements of each other. So, while deriving 
implementations for the inverting functions, we implement only $p_1$; $p_2$ is obtained as the 
negation of $p_1$. 

It is not hard to see how a preliminary implementation can be transformed into a 
target implementation in terms of the inverting functions. Fig. 19 gives a general procedure 
that does it for an arbitrary preliminary implementation. In the following, we illustrate the 
procedure on the preliminary implementation of SIZE. The preliminary implementation 
of SIZE consists of the following rewrite rules. 

SIZE(Create) -> 0 
SIZE(Insert(c, i)) -> SIZE(c) + 1 

Suppose the left hand side of the target implementation is SIZE(v). The expression on the 
right hand side is a nested if_then_else expression that performs a case analysis. There is a 
case corresponding to every rewrite rule in the preliminary implementation. In the present 
case the right hand side would have the following form: 

if b_1 then e_1 
else if b_2 then e_2 

The expressions $b_1$ and $e_1$ are determined from the first rewrite rule using the inverting 
expressions associated with the generator expression that appears as the argument to SIZE on 
the left hand side of the rewrite rule. The expressions $b_2$ and $e_2$ are determined similarly from 
the second rewrite rule. We will describe how $b_2$ and $e_2$ are determined since they are more 






Fig. 19. The Procedure RPM 

Suppose the preliminary implementation of F consists of the following rules: 

F(g_1) -> t_1 
  ... 
F(g_n) -> t_n 

Then, the target implementation for F is 

F(v) ::= if b_1 then s_1 
         else if b_2 then s_2 
         ... 
         else if b_n then s_n 

where 

(1) b_i is the boolean inverting expression of g_i, which is obtained by the procedure BIE described 
below. 

(2) s_i is the expression obtained by replacing every terminal in t_i by the component inverting 
expression of g_i that extracts the terminal. This is obtained by the procedure CIE described 
below. 

For convenience, we assume that the generators have an arity that is at 
most one. 

CIE = proc(α: generator expression, u: occurrence) 
        returns (component inverting expression) 
    Suppose α is σ(α_1) 
    d is the d-function associated with σ 
    if u = Λ then return(λ) 
    else if u = 1.v then return(d ∘ CIE(α_1, v)) 
end CIE 

BIE = proc(α: generator expression) returns (boolean inverting expression) 
    if α is a variable then return(λ) 
    else if α = σ(α_1) 
        then return(p ∧ BIE1(α_1, d)) 
    where p is the boolean inverting function associated with σ 
          d is the component inverting function associated with σ 

BIE1 = proc(α: generator expression, d: inverting function symbol) 
        returns (boolean inverting expression) 
    if α is a variable then return(λ) 
    else if α = σ(α_1) 
        then return(p ∘ d ∘ BIE(α_1)) 
    where p is the boolean inverting function associated with σ 
interesting than the determination of $b_1$ and $e_1$. $b_2$ is the expression that determines if v 
denotes a value constructed from an expression that has the form of Insert(c, i), so $b_2$ is $p_2(v)$. 
$e_2$ is identical to SIZE(c) + 1 except for the following modification: The variables c and i, 
which denote the components of the expression appearing as argument to SIZE on the left 
hand side of the rule, are replaced by the corresponding inverting expressions that extract 
those components from v. That is, c is replaced by $d_1(v)$ and i is replaced by $d_2(v)$. So, $e_2$ is 
SIZE(d_1(v)) + 1. $b_1$ and $e_1$ can be determined similarly. So the target implementation for 
SIZE in terms of the inverting functions is below: 

SIZE(v) ::= if p_1(v) then 0 
            else if p_2(v) then SIZE(d_1(v)) + 1 
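
The transformation just illustrated can be sketched as a small program that turns rewrite rules into a nested if_then_else over the inverting functions. This is a simplified illustration, not the procedure of Fig. 19: the term encoding, the tables P and D, and the helpers invert, render and rpm are assumptions made for the example; the names p1, p2, d1, d2 stand for the inverting functions of Circ_List; and nested generator patterns are handled by simply conjoining one test per generator.

```python
# Inverting functions of Circ_List, keyed by generator (names as in the text).
P = {"Create": "p1", "Insert": "p2"}           # boolean inverting functions
D = {"Create": [], "Insert": ["d1", "d2"]}     # component inverting functions

def invert(pattern, expr):
    """Return (tests, bindings) for a generator pattern matched against expr.

    A pattern is either a variable name (a string) or a tuple (generator, args...).
    """
    if isinstance(pattern, str):               # a variable binds to expr
        return [], {pattern: expr}
    gen, *args = pattern
    tests, bindings = [f"{P[gen]}({expr})"], {}
    for d, arg in zip(D[gen], args):
        sub_tests, sub_bindings = invert(arg, f"{d}({expr})")
        tests += sub_tests
        bindings.update(sub_bindings)
    return tests, bindings

def render(term, bindings):
    """Print a right hand side, replacing variables by inverting expressions."""
    if isinstance(term, str):
        return bindings.get(term, term)
    if isinstance(term, int):
        return str(term)
    op, *args = term
    parts = [render(a, bindings) for a in args]
    if op == "+":
        return " + ".join(parts)
    return f"{op}({', '.join(parts)})"

def rpm(name, rules, var="v"):
    """Build the target definition, one if-clause per rewrite rule."""
    clauses = []
    for pattern, rhs in rules:
        tests, bindings = invert(pattern, var)
        clauses.append(f"if {' and '.join(tests)} then {render(rhs, bindings)}")
    return f"{name}({var}) ::= " + "\nelse ".join(clauses)

SIZE_RULES = [
    (("Create",), 0),
    (("Insert", "c", "i"), ("+", ("SIZE", "c"), 1)),
]
print(rpm("SIZE", SIZE_RULES))
# SIZE(v) ::= if p1(v) then 0
# else if p2(v) then SIZE(d1(v)) + 1
```

For the nested pattern Insert(Insert(c, i), j), invert binds c to d1(d1(v)), i to d2(d1(v)) and j to d2(v), which corresponds to Remove(Remove(v)), Value(Remove(v)) and Value(v) once the inverting functions are implemented as described in the next subsection.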

6.1.2 Implementations for the Inverting Functions 

Implementations for the inverting functions are derived using the recursion 
eliminating method described in the next section. Note that the properties characterizing the 
inverting functions are expressed by means of a set of rewrite rules. Implementations for the 
inverting functions are determined by searching for appropriate compositions of the 
operations of the implementing types that satisfy the rewrite rules characterizing the inverting 
functions. In the following we show the theorem generation sequences that derive 
implementations for each of the inverting functions used above. 
Derivation for d_1 and d_2 

Relevant Rewrite Rules used for Expansion 

(1) Value(Create) -> ERROR 
(2) Value(Insert(c, i)) -> i 
(3) Remove(Create) -> ERROR 
(4) Remove(Insert(c, i)) -> c 

Form of the theorem to be generated: Insert(v, i) = Insert(f*_1(Insert(v, i)), f*_2(Insert(v, i))) 
Normal form of Insert(v, i): Insert(v, i) 
Rules used for the normal form: None 

Step (1) Invoke Synthesis Rule (1) on Insert(v, i) 
    Insert(v, i) = Insert(v, i) 

Step (2) Expand Expression: v 
    Using Rule: (4) 

    Insert(v, i) = Insert(Remove(Insert(v, i)), i) 

Step (3) Expand Expression: i 
    Using Rule: (2) 

    Insert(v, i) = Insert(Remove(Insert(v, i)), Value(Insert(v, i))) 

The above theorem determines the following solutions for f*_1 and f*_2: Remove and Value. Therefore, 
we have the following implementations for d_1 and d_2: 

d_1(v) ::= Remove(v) 
d_2(v) ::= Value(v) 



Derivation for p_1 

Relevant Rewrite Rules used for Expansion 

(8) Empty(Create) -> True 
(9) Empty(Insert(c, i)) -> False 

Form of the theorem to be generated: True = f*(Create) 
Normal form of True: True 
Rules used for the normal form: None 

Step (1) Invoke Synthesis Rule (1) on True 
    True = True 

Step (2) Expand Expression: True 
    Using Rule: (8) 

    True = Empty(Create) 

The last theorem determines the following solution for f*: Empty. Note that this function also satisfies 
the other rewrite rule characterizing p_1, namely p_1(Insert(c, i)) -> False. Therefore, p_1 can be 
implemented as follows: 

p_1(v) ::= Empty(v) 

6.2 The Recursion Eliminating Method 

Let us suppose we are deriving a target implementation for an implementing 
function F whose preliminary implementation consists of the set of rewrite rules given below. 

F(g_1) -> t_1 
  ... 
F(g_n) -> t_n 

We assume that F is a single variable function for convenience. The general description of 
the method given below can be extended easily to a multivariable function. In a target 
implementation, the function F is defined as F(v) ::= e, where v is a variable, and e is an 
expression containing v and any of the following function symbols: 

(1) Operations of the implementing types 

(2) The implementing functions 

(3) The function if_then_else 

Let us denote e as f*(v), where f* is some composition of the function symbols listed 
above. The derivation of a target implementation consists of finding a suitable f*. The 
composition f* should be such that the function defined by F(v) ::= f*(v) has the same 
behavior as the one defined by the set of rewrite rules given above. 

To characterize the problem formally, we define the following concept. A 
composition f* satisfies a rewrite rule of F if the equation obtained by substituting f* for F on 
both the sides of the rewrite rule is a theorem of the rewriting system consisting of the 
preliminary implementation and the specifications of the implementing types. For example, 
the composition Rotate(Insert(d, k)) satisfies the rule 

ENQUEUE(Insert(c, i), j) -> Insert(ENQUEUE(c, j), i) if the equation 

Rotate(Insert(Insert(c, i), j)) = Insert(Rotate(Insert(c, j)), i) is a theorem. 

The composition f* to be derived should be such that f* satisfies each of the rewrite 
rules in the preliminary implementation of F. That is, the following equations should be 
theorems. (The notation t_i[F <- f*] denotes the expression obtained by replacing F by f* in 
t_i.) 

f*(g_1) = t_1[F <- f*] 
  ... 
f*(g_n) = t_n[F <- f*] 

The purpose of the above formulation (of the condition that a solution for f* is 
supposed to satisfy) is to allow us to use a theorem generation strategy similar to the one used 
in deriving a preliminary implementation. We generate a theorem using one of the above 
equations as a template by treating f* as a place holder in the equation. Let us call this 
equation the template equation. A theorem that has the form of the template equation 
determines a candidate for f*. A single theorem may determine more than one candidate for 
f* but only finitely many, because the expressions we are dealing with have finite size. The 
candidate(s) can be determined automatically by comparing the theorem with the template 
equation. The goal is to generate a theorem that not only has the form of the template 
equation but is also such that the candidate for f* satisfies the rest of the equations in the 
preliminary implementation of F. 

The generation of theorems is carried out in the same fashion as in deriving the 
preliminary implementation. We use the same set of synthesis rules developed earlier. The 
theorems that are of interest to us in the present situation involve only the operations of the 
implementing types and the implementing functions. Therefore, the rewriting system that is 
used for performing expansion (while generating the theorems) consists of the preliminary 
implementation and the specifications of the implementing types. In contrast, the rewriting 
system used in the derivation of the preliminary implementation consisted of the 
specifications of the implemented type and the association specification. Note that the 
preliminary implementation did not exist at that time. Checking if a candidate for f* satisfies 
the rewrite rules essentially involves checking if an equation is a theorem. 

Let us illustrate the method on the derivation of the target implementation for 
ENQUEUE shown earlier. The preliminary implementation of ENQUEUE is repeated 
below for ease of reference. 

ENQUEUE(Create, j) -> Insert(Create, j) 
ENQUEUE(Insert(c, i), j) -> Insert(ENQUEUE(c, j), i) 

The f* to be derived should be such that the following equations are theorems. (Note that the 
equations are obtained by replacing ENQUEUE by f* in the rewrite rules, and then 
interchanging the two sides. The reason for interchanging the sides will be explained shortly.) 

(1) Insert(Create, j) = f*(Create, j) 

(2) Insert(f*(c, j), i) = f*(Insert(c, i), j) 

We use equation (1) as the template equation. The nature of our synthesis rules imposes 
certain restrictions on the equations that can be used as a template. The synthesis rules are 
formulated to generate theorems with a known left hand side, but an unknown right hand 
side. So, the template equation should be such that the unknown entity f* appears only on 
the right hand side. In equation (2) both sides are unknown since f* occurs on both the sides. 
This was also the reason behind interchanging the two sides of the rewrite rules while 
obtaining the above equations. Note that there always exists at least one equation with a 
known left hand side. This corresponds to the rewrite rule in the preliminary 
implementation of F that represents the basis case. 

Shown below is a sequence of steps that generates a theorem that gives rise to a 
target implementation. 

Relevant Rewrite Rules used for Expansion 

(3) Rotate(Create) -> Create 
(4) Rotate(Insert(Create, i)) -> Insert(Create, i) 
(5) Rotate(Insert(Insert(c, i1), i2)) -> Insert(Rotate(Insert(c, i2)), i1) 

Form of the theorem to be generated: Insert(Create, j) = f*(Create, j) 
Normal form of Insert(Create, j): Insert(Create, j) 
Rules used for the normal form: None 

Step (1) Invoke Synthesis Rule (1) on Insert(Create, j) 
    Insert(Create, j) = Insert(Create, j) 

Step (2) Expand Expression: Insert(Create, j) 
    Using Rule: (4) 

    Insert(Create, j) = Rotate(Insert(Create, j)) 

The right hand side of the last theorem generated in the above series has the form of 
f*(Create, j), and hence can be used to generate a set of candidate compositions. A candidate 
composition is determined from three expressions: 

(1) the left hand side of the target implementation, say F(v_1, ..., v_n), 

(2) the right hand side of the theorem generated, say α, and 

(3) the right hand side of the template equation, say f*(g_1, ..., g_n). 

It is obtained by replacing zero or more occurrences of g_i, for every 1 ≤ i ≤ n, in α by a 
variable v_j, 1 ≤ j ≤ n. The replacement of g_i by v_j is made so that type consistency is 
preserved. 

For the current example, the left hand side of the target implementation is 
ENQUEUE(d, k); the right hand side of the theorem generated is Rotate(Insert(Create, 
j)); the right hand side of the template equation is f*(Create, j). So, there are two candidates 
for f*(d, k): (1) Rotate(Insert(d, k)) and (2) Rotate(Insert(Create, k)). 

The second candidate does not satisfy equation (2). The equation obtained by 
replacing f* in the equation by the candidate is 
Insert(Rotate(Insert(Create, j)), i) = Rotate(Insert(Create, j)). This is not a theorem of 
Circ_List because (for every i and j) the normal forms of the two sides of the equation are 
not identical. (This can be checked by Is-an-inductive-theorem-of.) 

Let us consider the first candidate. The equation obtained by substituting it for f* in 
equation (2) is Rotate(Insert(Insert(c, i), j)) = Insert(Rotate(Insert(c, j)), i), and this is a 
theorem of Circ_List. (The left hand side of the equation reduces to the right hand side by 
the rewrite rule (5).) Hence Rotate(Insert(d, k)) satisfies equation (2), whereas the second 
candidate does not. Hence the target implementation is: 

ENQUEUE(d, k) ::= Rotate(Insert(d, k)) 
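
The candidate generation and checking steps for ENQUEUE can be sketched as follows. The sketch rests on several assumptions that are not part of the thesis: circular lists are modelled as Python tuples (Insert appends an element), candidates are checked by evaluating equations (1) and (2) on a handful of sample values instead of calling Is-an-inductive-theorem-of, and the helper names candidates, variables, evaluate and satisfies_enqueue_equations are hypothetical.

```python
from itertools import product

def insert(c, i):
    return c + (i,)

def rotate(c):
    if len(c) <= 1:                      # rules (3) and (4)
        return c
    *rest, i1, i2 = c                    # c = Insert(Insert(rest, i1), i2)
    return insert(rotate(insert(tuple(rest), i2)), i1)   # rule (5)

OPS = {"Create": lambda: (), "Insert": insert, "Rotate": rotate}

def evaluate(term, env):
    """Evaluate a term; strings are variables, tuples are (op, args...)."""
    if isinstance(term, str):
        return env[term]
    op, *args = term
    return OPS[op](*(evaluate(a, env) for a in args))

def variables(term):
    if isinstance(term, str):
        return {term}
    return set().union(*(variables(a) for a in term[1:]))

def candidates(term, replacements):
    """Replace zero or more occurrences of each key subterm by its variable."""
    options = []
    if term in replacements:
        options.append(replacements[term])
    if isinstance(term, str):
        options.append(term)
    else:
        op, *args = term
        for combo in product(*(candidates(a, replacements) for a in args)):
            options.append((op, *combo))
    return options

def satisfies_enqueue_equations(cand):
    """Test equations (1) and (2) for f*(d, k) = cand on sample values."""
    f = lambda c, j: evaluate(cand, {"d": c, "k": j})
    samples = [(), (1,), (1, 2), (3, 1, 2)]
    for c, i, j in product(samples, range(3), range(3)):
        if insert((), j) != f((), j):                  # equation (1)
            return False
        if insert(f(c, j), i) != f(insert(c, i), j):   # equation (2)
            return False
    return True

theorem_rhs = ("Rotate", ("Insert", ("Create",), "j"))
# Keep only candidates whose variables all occur in ENQUEUE(d, k).
cands = [t for t in candidates(theorem_rhs, {("Create",): "d", "j": "k"})
         if variables(t) <= {"d", "k"}]
survivors = [t for t in cands if satisfies_enqueue_equations(t)]
print(cands)      # Rotate(Insert(d, k)) and Rotate(Insert(Create, k))
print(survivors)  # only Rotate(Insert(d, k))
```

Testing on samples can only refute a candidate; a candidate that survives still has to be confirmed as a theorem, which is the role of Is-an-inductive-theorem-of in the text.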

6.3 An Illustration of a Complete Synthesis 

In the following, we illustrate the complete synthesis, i.e., both the 
stages, on two examples. The first one derives a target implementation for the operation 
Append of Queue_Int using the association specification that specifies the Circ_List 
representation. The second example derives a target implementation for Front using the 
association specification that specifies the <Array_Int × Integer × Integer> representation 
(see chapter 5). 

Illustration 1 
Stage 1: 

Partial Preliminary Implementation of Append at Hand 

APPEND(c, Create) -> ?rhs_1 
APPEND(c, Insert(d, i)) -> ?rhs_2 

Relevant Rewrite Rules of the Perturbed World 

(10) Append(q, Nullq) -> q 

(14) ℋ(Create) -> Nullq 

(20) ℋ(ENQUEUE(c, i)) -> Enqueue(ℋ(c), ℋ(i)) 

(22) ℋ(APPEND(c, d)) -> Append(ℋ(c), ℋ(d)) 

Derivation of the first rewrite rule 

Form of the theorem to be generated: ℋ(APPEND(c, Create)) = ℋ(?rhs_1) 
Normal form of ℋ(APPEND(c, Create)): ℋ(c) 
Rules used for the normal form: (22), (14), (10) 

Step (1) Invoke Synthesis Rule (1) on ℋ(APPEND(c, Create)) 

    ℋ(APPEND(c, Create)) = ℋ(c) 

The above theorem is such that APPEND(c, Create) ≻ c. Therefore the desired rewrite rule is: 
APPEND(c, Create) -> c 



Derivation of the second rewrite rule 

Form of the theorem to be generated: ℋ(APPEND(c, Insert(Create, i))) = ℋ(?rhs_2) 
Normal form of ℋ(APPEND(c, Insert(Create, i))): Enqueue(ℋ(c), ℋ(i)) 
Rules used for the normal form: 

Step (1) Invoke Synthesis Rule (1) on ℋ(APPEND(c, Insert(Create, i))) 
    ℋ(APPEND(c, Insert(Create, i))) = Enqueue(ℋ(c), ℋ(i)) 

Step (2) Expand Expression: Enqueue(ℋ(c), ℋ(i)) 
    Using Rule: (10) 

    ℋ(APPEND(c, Insert(Create, i))) = Append(Enqueue(ℋ(c), ℋ(i)), Nullq) 

Step (3) Expand Expression: Nullq 
    Using Rule: (14) 

    ℋ(APPEND(c, Insert(Create, i))) = Append(Enqueue(ℋ(c), ℋ(i)), ℋ(Create)) 

Step (4) Expand Expression: Enqueue(ℋ(c), ℋ(i)) 
    Using Rule: (20) 

    ℋ(APPEND(c, Insert(Create, i))) = Append(ℋ(ENQUEUE(c, i)), ℋ(Create)) 

Step (5) Expand Expression: Append(ℋ(ENQUEUE(c, i)), ℋ(Create)) 
    Using Rule: (22) 

    ℋ(APPEND(c, Insert(Create, i))) = ℋ(APPEND(ENQUEUE(c, i), Create)) 

Step (6) Generalize the theorem in step (5) by replacing the constant 
    Create by the variable d to obtain the following equation: 

    ℋ(APPEND(c, Insert(d, i))) = ℋ(APPEND(ENQUEUE(c, i), d)) 

    Apply Is-an-inductive-theorem-of on the above equation. 
    This yields True, confirming that the equation is a theorem. 

Hence the desired rule (obtained by dropping ℋ on both sides) is: 

APPEND(c, Insert(d, i)) -> APPEND(ENQUEUE(c, i), d) 



Stage 2: 

Preliminary Implementation at Hand 

APPEND(c, Create) -> c 

APPEND(c, Insert(d, i)) -> APPEND(ENQUEUE(c, i), d) 

Desired Form of Target Implementation 
APPEND(v_1, v_2) ::= ?? 

Relevant Rules of Circ_List 

(10) Join(c, Create) -> c 

(11) Join(c, Insert(d, i)) -> Insert(Join(c, d), i) 

Template Equation Chosen: c = APPEND(c, Create) 
Form of the theorem to be generated: c = f*(c, Create) 
Normal form of c: c 
Rules used for the normal form: None 

Step (1) Invoke Synthesis Rule (1) on c 
    c = c 

Step (2) Expand Expression: c 
    Using Rule: (10) 

    c = Join(c, Create) 

Step (3) Find a suitable candidate composition. 

The right hand side of the above theorem has the form of f*(c, Create). So, find a suitable candidate 
composition. There are two possibilities: (1) Join(v_1, v_2), and (2) Join(v_2, v_1). The second candidate 
satisfies the second rule of the preliminary implementation, but the first does not. So, a possible target 
implementation is: 

APPEND(v_1, v_2) ::= Join(v_2, v_1) 
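
The choice between the two Join candidates can be checked on a concrete model. The sketch below is an illustration under assumptions: circular lists are modelled as Python tuples with Insert appending an element, join and enqueue are transcribed from rules (10) and (11) and from the preliminary implementation of ENQUEUE, and the check runs over a few sample values rather than proving the equations. The helper name satisfies_append_rules is hypothetical.

```python
from itertools import product

def join(c, d):
    """Join per rules (10) and (11) of Circ_List, on the tuple model."""
    if not d:
        return c                                  # Join(c, Create) -> c
    return join(c, d[:-1]) + (d[-1],)             # Join(c, Insert(d, i))

def enqueue(c, j):
    """ENQUEUE per its preliminary implementation, on the tuple model."""
    if not c:
        return (j,)                               # ENQUEUE(Create, j)
    return enqueue(c[:-1], j) + (c[-1],)          # ENQUEUE(Insert(c, i), j)

def satisfies_append_rules(f):
    """Check both preliminary rules of APPEND with f substituted for APPEND."""
    samples = [(), (1,), (1, 2), (3, 1, 2)]
    for c, d, i in product(samples, samples, range(3)):
        if f(c, ()) != c:                         # APPEND(c, Create) -> c
            return False
        if f(c, d + (i,)) != f(enqueue(c, i), d): # second rule
            return False
    return True

first = lambda v1, v2: join(v1, v2)     # candidate (1): Join(v_1, v_2)
second = lambda v1, v2: join(v2, v1)    # candidate (2): Join(v_2, v_1)
print(satisfies_append_rules(first))    # False
print(satisfies_append_rules(second))   # True
```

On this model the first candidate already fails for c = (1,), d = () and i = 2, which agrees with the choice of Join(v_2, v_1) made above.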
Illustration 2 

Stage 1: 

Partial Preliminary Implementation of Front 

FRONT(<v, i, i>) -> ?rhs_1 

FRONT(<Assign(v, e, i), i, i+1>) -> ?rhs_2 

FRONT(<Assign(Assign(v, e_1, j), e_2, j+1), i, j+2>) -> ?rhs_3 

Relevant Rewrite Rules of the Perturbed World 

(1) ℋ(<v, i, i>) -> Nullq 

(2) ℋ(<Assign(v, e, j), i, j+1>) -> if_then_else(i = j+1, Nullq, Enqueue(ℋ(<v, i, j>), ℋ(e))) 

(3) ℋ(FRONT(x)) -> Front(ℋ(x)) 

(4) ℋ(ERROR) -> Error 

(5) ℋ(if_then_else(b, v_1, v_2)) -> if_then_else(b, ℋ(v_1), ℋ(v_2)) 



Derivation of the first rewrite rule 

Form of the theorem to be generated: ℋ(FRONT(<v, i, i>)) = ℋ(?rhs_1) 
ℋ(FRONT(<v, i, i>))↓: Error 

Rules used for simplification: 

Step (1) Invoke Synthesis Rule (1) on ℋ(FRONT(<v, i, i>)) 
    ℋ(FRONT(<v, i, i>)) = Error 

Step (2) Expand Expression: Error 
    Using Rule: (4) 

    ℋ(FRONT(<v, i, i>)) = ℋ(ERROR) 

FRONT(<v, i, i>) -> ERROR 



Derivation of the second rewrite rule 

Form of the theorem to be generated: ℋ(FRONT(<Assign(v, e, i), i, i+1>)) = ℋ(?rhs_2) 
ℋ(FRONT(<Assign(v, e, i), i, i+1>))↓: ℋ(e) 
Rules used for simplification: 

Step (1) Invoke Synthesis Rule (1) on ℋ(FRONT(<Assign(v, e, i), i, i+1>)) 
    ℋ(FRONT(<Assign(v, e, i), i, i+1>)) = ℋ(e) 

FRONT(<Assign(v, e, i), i, i+1>) -> e 



Derivation of the third rewrite rule 

Form of the theorem to be generated: 
    ℋ(FRONT(<Assign(Assign(v, e_1, j), e_2, j+1), i, j+2>)) = ℋ(?rhs_3) 
ℋ(FRONT(<Assign(Assign(v, e_1, j), e_2, j+1), i, j+2>))↓: 
    if_then_else(i = j+2, Error, if_then_else(i = j+1, ℋ(e_2), 
        Front(Enqueue(ℋ(<v, i, j>), e_1)))) 
Rules used for simplification: 

Step (1) Invoke Synthesis Rule (1) 

    ℋ(FRONT(<Assign(Assign(v, e_1, j), e_2, j+1), i, j+2>)) = 
        if_then_else(i = j+2, Error, if_then_else(i = j+1, ℋ(e_2), 
            Front(Enqueue(ℋ(<v, i, j>), e_1)))) 

Step (2) Expand Expression: Front(Enqueue(ℋ(<v, i, j>), e_1)) 
    Using Rule: (2), Protocol 3 

    TW Update: 
        i = j+2 -> False 
        i = j+1 -> False 

    ℋ(FRONT(<Assign(Assign(v, e_1, j), e_2, j+1), i, j+2>)) = 
        if_then_else(i = j+2, Error, if_then_else(i = j+1, ℋ(e_2), 
            Front(ℋ(<Assign(v, e_1, j), i, j+1>)))) 

Step (3) Expand Expression: ℋ(<Assign(v, e_1, j), i, j+1>) 
    Using Rule: (3) 

    ℋ(FRONT(<Assign(Assign(v, e_1, j), e_2, j+1), i, j+2>)) = 
        if_then_else(i = j+2, Error, if_then_else(i = j+1, ℋ(e_2), 
            ℋ(FRONT(<Assign(v, e_1, j), i, j+1>)))) 

Step (4) Expand Expression: Error 
    Using Rule: (4) 

    ℋ(FRONT(<Assign(Assign(v, e_1, j), e_2, j+1), i, j+2>)) = 
        if_then_else(i = j+2, ℋ(ERROR), if_then_else(i = j+1, ℋ(e_2), 
            ℋ(FRONT(<Assign(v, e_1, j), i, j+1>)))) 

Step (5) Expand Expression: if_then_else(i = j+2, ℋ(ERROR), if_then_else(i = j+1, ℋ(e_2), 
        ℋ(FRONT(<Assign(v, e_1, j), i, j+1>)))) 
    Using Rule: (5) 

    ℋ(FRONT(<Assign(Assign(v, e_1, j), e_2, j+1), i, j+2>)) = 
        ℋ(if_then_else(i = j+2, ERROR, if_then_else(i = j+1, e_2, 
            FRONT(<Assign(v, e_1, j), i, j+1>)))) 

FRONT(<Assign(Assign(v, e_1, j), e_2, j+1), i, j+2>) -> 
    if_then_else(i = j+2, ERROR, if_then_else(i = j+1, e_2, 
        FRONT(<Assign(v, e_1, j), i, j+1>))) 

Stage 2: 

Preliminary Implementation at Hand 

FRONT(<v, i, i>) -> ERROR 

FRONT(<Assign(v, e, i), i, i+1>) -> e 

FRONT(<Assign(Assign(v, e_1, j), e_2, j+1), i, j+2>) -> if i = j+2 then ERROR 
                                                        else if i = j+1 then e_2 
                                                        else FRONT(<Assign(v, e_1, j), i, j+1>) 



Let FRONT(<arr, pnt1, pnt2>) be the left hand side of the target implementation. We use a slightly
different method from the one normally used for deriving the target implementation for Front. We use
a combination of the recursion preserving method and the recursion eliminating method. First, a
composition that satisfies the first rewrite rule is determined separately; it is easy to see that this can be
ERROR. Then, a composition that satisfies the second and the third rewrite rules is determined. The
two compositions are then combined with the help of a boolean inverting expression to arrive at the
target implementation. Note that the boolean inverting expression that characterizes the argument
structure corresponding to the first rewrite rule is pnt1 = pnt2. Therefore, the desired form of the
target implementation is as below. The expression that takes the place of the else clause is to be
determined so that the second and the third rewrite rules are satisfied.

Desired Form of the Target Implementation 

FRONT(<arr, pnt1, pnt2>) ::= if pnt1 = pnt2 then ERROR

                             else ??

Relevant Rewrite Rules of Array_Int and Array_Int × Integer × Integer

The first two rules specify the Read operation of Array_Int that reads an element of an array. The third
rewrite rule specifies the operation of a triple that selects the first component.

(1) Read(Nullarray, i) → ERROR

(2) Read(Assign(v, e, j), i) → if i = j then e

                               else Read(v, i)

(3) First(<v, k, l>) → v



Template equation chosen: e = FRONT(<Assign(v, e, i), i, i+1>)
Form of the theorem to be generated: e = f*(<Assign(v, e, i), i, i+1>)
Normal form of e: e 
Rules used for simplification: None 

Step (1) Invoke Synthesis Rule (1) on e

e = e



Step (2) Expand Expression: e
Using Rule: (2), Protocol 2

e = Read(Assign(v, e, i), i)



Step (3) Expand Expression: Assign(v, e, i)
Using Rule: (3)

e = Read(First(<Assign(v, e, i), k, l>), i)



Step (4) Replace variables in the theorem by appropriate terminals:
v ↦ v, i ↦ i, k ↦ i, l ↦ i+1

e = Read(First(<Assign(v, e, i), i, i+1>), i)



The right hand side of the last theorem generated has the form of f*(<Assign(v, e, i), i, i+1>). It
determines the candidate composition Read(First(<arr, pnt1, pnt2>), pnt1), which can be simplified to
Read(arr, pnt1). This composition is such that when it takes the place of ?? in the partial target
implementation shown above, the whole expression satisfies the third rewrite rule in the preliminary
implementation. Hence, a possible target implementation for FRONT is:



FRONT(<arr, pnt1, pnt2>) ::= if pnt1 = pnt2 then ERROR

                             else Read(arr, pnt1)
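
The derived implementation can be exercised on small instances. The following is a minimal Python sketch, not part of the synthesis procedure itself. It assumes that arrays are represented as nested Assign terms (Python tuples) and that the error value is the string "ERROR"; the names NULLARRAY, assign, read and front are illustrative choices of ours. The function read follows rewrite rules (1) and (2) of Array_Int, and front follows the target implementation for FRONT.

```python
ERROR = "ERROR"
NULLARRAY = ("Nullarray",)


def assign(arr, elem, index):
    """Build the term Assign(arr, elem, index)."""
    return ("Assign", arr, elem, index)


def read(arr, i):
    """Read(arr, i), following rewrite rules (1) and (2) of Array_Int."""
    if arr == NULLARRAY:
        return ERROR                     # rule (1)
    _, rest, elem, j = arr
    return elem if i == j else read(rest, i)   # rule (2)


def front(triple):
    """FRONT(<arr, pnt1, pnt2>) as given by the target implementation."""
    arr, pnt1, pnt2 = triple
    return ERROR if pnt1 == pnt2 else read(arr, pnt1)


# A two-element queue stored at indices 0 and 1, with the rear pointer at 2.
two = assign(assign(NULLARRAY, "e1", 0), "e2", 1)
print(front((two, 0, 2)))   # front pointer at index 0
print(front((two, 2, 2)))   # front pointer equals rear pointer
```

On instances of this kind the result of front can be compared with the right hand sides of the three rewrite rules of the preliminary implementation.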
7. Conclusions and Future Research

Algebraic specifications for data types have been extensively used to prove
properties of data types and to establish the correctness of implementations of data types. In
this thesis we have investigated the task of automatically synthesizing implementations for
abstract data types starting from their algebraic specifications. In this chapter we summarize
the major contributions of the thesis, describe the important conclusions the research has led
us to, and provide directions for further research.

One of the main decisions that we were confronted with at the start of the research 
was choosing and characterizing the inputs to the synthesis procedure. It is not surprising to 
expect as inputs the specification of the implemented type, and the specifications of all the 
implementing types. The novelty of our method lies in the use of two other inputs: the 
homomorphism information and the termination ordering. The advantages of having them 
as inputs became more evident as the research progressed. 

The homomorphism information makes the problem more tractable by restricting 
the space to be searched in finding an implementation because it imposes additional 
constraints on the synthesis equations (see chapter 4). It is informative in this respect to 
compare our method with that of Gkrent's [40]. The method developed in [40] can also be 
reformulated as a theorem generation activity within our framework. His method, however, 
is less general and less efficient than ours because he does not use the homomorphism 
information. In order to compensate for the lack of this information he is forced to severely 
restrict the form of the specifications. 

The termination ordering is not essential but is useful for automating the synthesis 
procedure. The basic method of manipulation used by the synthesis procedure is expansion 
(see sections 4.4.1 and 4.5). Expansion, unlike reduction, is not uniformly terminating - even
when the specifications are convergent (see section 3.3). This makes the synthesis procedure 
potentially nonterminating. The termination ordering circumvents this problem. It also 
ensures the termination of the implementation derived. The synthesis method used by 
Darlington [7] does not explicitly indicate the use of any termination ordering. This is one of
the reasons that the issue of termination (that of the synthesis procedure, or that of the
implementation derived) is not addressed in [7].

An important contribution of the thesis is the development of a formal basis for the 
method used by the synthesis procedure. The development is influenced significantly by the 
techniques used for verifying the correctness of implementations of algebraically specified 
data types. The synthesis method has two distinguishing features. The first is that it is based 
on the general principle of reversing the techniques of program verification. The second is 
the decomposition of the procedure into two stages. 

The reverse program verification principle led us to view the synthesis problem
(see chapter 4) as one of generating a set of theorems that satisfy the synthesis conditions. 
The synthesis conditions characterize the situations in which a set of theorems of the input 
specifications is guaranteed to yield a correct implementation. The synthesis rules provide a 
means of generating theorems from a specification. This approach to synthesis has two 
advantages. Firstly, it makes the formal justification of the correctness of the synthesis 
method simple because the synthesis conditions are based on a criterion of correctness for 
data types. Secondly, it allows us to build on the research in the area of program verification - 
past as well as future. This naturally suggests an area in which to pursue future research. It 
concerns extending the theory in which the synthesis procedure operates. Currently it 
operates in the part of inductive theory of the specification that is decided by the Musser/KB 
method (see chapter 4) of proving equational and inductive properties of rewriting systems. 
This extension would involve developing new synthesis rules, and new ways of using the 
synthesis rules for generating theorems. One might, for example, look into ways of 
assimilating the proof techniques used by various verifiers [5, 27] into our framework. 

Another advantage of decomposing the procedure into two stages is that it makes 
the procedure more modular. It isolates the part that is dependent on the target language. So 
modifications to the target language can be made without drastically affecting the synthesis 
procedure. A possible extension to the thesis that could be considered is to incorporate more 
equivalence preserving transformations into the second stage. The transformations can be 
either of an efficiency improving nature, or language developing nature such as applicative to 
imperative transformations. 

In addition to characterizing the inputs, an important contribution of the thesis is
the characterization of the generality of the synthesis method. The thesis formally
characterizes (see chapter 2 and section 3.3) the restrictions on the inputs, and the conditions
under which it succeeds in finding an implementation. This was possible primarily as a result
of the development of the formal basis for the synthesis method.

Finally, but most importantly, let us address the question that any work on program 
synthesis has to confront: How far does the work go towards making the programmer's task
superfluous? The practical utility of a work in program synthesis can be determined by
evaluating the following aspects of the synthesis procedure: Efficiency of the synthesis 
method, efficiency of the implementations derived, and the ease of writing specifications. 

The main source of inefficiency in the synthesis procedure stems from the 
non-uniquely terminating nature of expansion. This forces (as shown in section 4.5) the 
procedure to keep track of all possible expansion paths. The implementation of the 
procedure given in section 4.5 uses only the most obvious ways in which unproductive paths 
can be pruned. There are several avenues for further research in this area. One can 
investigate the use of various heuristic approaches for cutting down unproductive paths. 
Another possibility is to make better use of the invariant information available in the 
association specification. The procedure (see chapter 5) currently uses it just as one of the 
conditions to terminate the theorem generation activity. A better utilization of it would be to 
guide the theorem generation activity. For instance, it would be more useful if it were 
possible to deduce from the invariant specification certain structural properties of expressions 
that prevented them from satisfying the invariant. This could then be used to discontinue
unproductive expansion paths during theorem generation. It is hard to extract this kind of
information from an algebraic specification of the invariant. It would be interesting to consider
other means of specifying the invariant which can help this cause.

The synthesis procedure currently does not take into consideration the efficiency of 
its output in synthesizing an implementation. It derives the implementations that it is capable 
of deriving in increasing order of complexity (in terms of the number of reduction steps 
needed) of the proof of the implementations. There are two reasons for this. Firstly, we know
of no good ways of specifying performance constraints for operations of data types within an
algebraic framework. Secondly, it was beyond the scope of the current work to incorporate
automatic performance analysis of the implementations. There is some recent work being
done in this area in [50] that is compatible with the algebraic theory of data types. It would be
interesting to investigate the interaction between our work and that of [50].

The main reason for choosing an equational language to express the inputs was
the benefits it offers from a proof theoretical point of view. Equational
specifications have generally been found hard to write. This is one of the factors that reduces 
the practical value of the procedure. It would be useful to extend the synthesis procedure to 
accept specifications in a language that is easier to write. 

We believe that the goal of the research in program synthesis (and program 
verification) should not and cannot be to relieve the programmer completely of the burden of 
programming. Rather, it should be to help us gain a better insight into the science of 
programming. The insight gained can be utilized in several ways that are practically relevant, 
such as in the design of new programming languages, and in the development of program
maintenance and program development systems [19, 49, 2, 3]. We believe that our work can
be particularly useful in the latter area. 
Appendix I - Equations as Rewrite Rules



Automatic verification of data types that are specified equationally is often based on treating the 
equations in the specifications as rules for rewriting expressions that have certain patterns. The 
automation of our synthesis method also relies on such a treatment of the specifications. This appendix 
describes the basic concepts about rewrite rules, and some useful properties of sets of rewrite rules. 

We assume a denumerable set (V) of elements called variables, and a finite set Σ of function symbols.
We define expressions and constants over Σ as follows. (The formal definition is similar to the
informal one given back in sec. 3.3.1.)

Expressions 

An expression is either (1) a variable, or (2) a function symbol f followed by a sequence of n ≥ 0
expressions e1, ..., en. f is called the (main) function of this expression, and e1, ..., en are called the
arguments. Such an expression is written f(e1, ..., en). An expression with no arguments is written
as f( ). We denote the set of expressions defined over Σ as E(Σ).

We assume it is possible to test variables and function symbols for equality. Two expressions α and β
are regarded as identically equal (written α ≡ β) if and only if they are both the same variable or they
have the same main function symbol and the same number of identically equal arguments, in the same
order.

The variable set of an expression α is {α} if α is a variable; otherwise it is the union of the variable
sets of the arguments of α.

The subexpressions of an expression are (1) the entire expression, and (2) the subexpressions of the
arguments (if any) of the expression. Expressions which are variables have no subexpressions other
than themselves.

Constants 

A constant is an expression that does not contain any variables. We denote the set of constants over Σ
as T(Σ). The subconstants of a constant are (1) the entire constant, and (2) the subconstants of the
arguments (if any) of the constant.

Occurrences 

An expression can be represented naturally as a tree structure: The main function symbol of the 
expression is the root of the tree; the arguments of the expression are the branches of the tree. This 
analogy can be used to devise a notation to identify unambiguously the subexpressions of an 
expression. 

An occurrence in an expression is a sequence (possibly empty) of positive integers that denotes the
path inside the tree corresponding to the expression that runs from the root of the tree to the root of
the tree corresponding to one of the subexpressions. We denote the set of all occurrences in an
expression e by O(e). We use the following notation for denoting an occurrence: λ is the empty
occurrence, and if u is an occurrence and i is an integer, then i.u is the occurrence that has i at its head
and u as its tail.

The subexpression of an expression e at the occurrence u, denoted by e/u, is defined as follows:
If u = λ, then e/λ = e
If u = i.w (1 ≤ i ≤ n), and e = f(e1, ..., en), then e/u = ei/w

For example, suppose e = Enqueue(Dequeue(Nullq( )), i). Then e/1 = Dequeue(Nullq( )),
e/2 = i, e/1.1 = Nullq( ).

Suppose u is an occurrence of e. Then, we use the notation e[u ← e'] to denote the expression
obtained by replacing in e the subexpression e/u by e'. For instance, suppose e is the same expression
as in the example given above, and e' = Nullq( ), then e[1 ← e'] is Enqueue(Nullq( ), i).
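
The occurrence notation maps directly onto a tree representation. The following Python sketch is only an illustration under our own assumptions: a variable is represented as a Python string, an expression f(e1, ..., en) as the tuple (f, e1, ..., en), and an occurrence as a tuple of positive integers. The function names occurrences, subexpr and replace_at are hypothetical.

```python
def occurrences(e):
    """Return all occurrences in e; the empty tuple is the empty occurrence."""
    if isinstance(e, str):          # a variable has only the empty occurrence
        return [()]
    occs = [()]
    for i, arg in enumerate(e[1:], start=1):
        occs += [(i,) + w for w in occurrences(arg)]
    return occs


def subexpr(e, u):
    """Return e/u, the subexpression of e at occurrence u."""
    for i in u:
        e = e[i]                    # argument i sits at tuple index i
    return e


def replace_at(e, u, new):
    """Return e[u <- new], leaving e itself unchanged."""
    if not u:
        return new
    i, rest = u[0], u[1:]
    return e[:i] + (replace_at(e[i], rest, new),) + e[i + 1:]


# e = Enqueue(Dequeue(Nullq( )), i), as in the example in the text
e = ("Enqueue", ("Dequeue", ("Nullq",)), "i")
print(subexpr(e, (1,)), subexpr(e, (2,)), subexpr(e, (1, 1)))
print(replace_at(e, (1,), ("Nullq",)))
```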

Substitutions 

Let σ be a mapping from variables to expressions, such that σ(v) = v for all but a finite number of
variables v. Extend the domain of σ to the set of all expressions by defining σ(f(e1, ..., en)) to be
f(σ(e1), ..., σ(en)). Such a mapping σ is called a substitution (of expressions for variables). The
notation σ = [v1 ↦ e1, ..., vn ↦ en] will be used to denote the substitution σ such that σ(vi) = ei,
for 1 ≤ i ≤ n, and σ(v) = v for every other variable v.

We say that an expression β has the form of an expression α if there exists a substitution σ such that
σ(α) = β. For example, Append(Nullq( ), Enqueue(q, i)) has the form of
Append(q1, Enqueue(q2, i2)) by the substitution σ = [q1 ↦ Nullq( ), q2 ↦ q, i2 ↦ i]. Notice that
"has the form of" is not a symmetric relation.
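
Applying a substitution and testing whether one expression has the form of another can be sketched as follows. This is a minimal Python illustration under our own assumptions (variables are strings, f(e1, ..., en) is the tuple (f, e1, ..., en), a substitution is a dictionary); the names substitute and match are hypothetical.

```python
def substitute(sigma, e):
    """Apply the substitution sigma (a dict from variables to expressions) to e."""
    if isinstance(e, str):
        return sigma.get(e, e)      # unmapped variables are left unchanged
    return (e[0],) + tuple(substitute(sigma, a) for a in e[1:])


def match(alpha, beta, sigma=None):
    """Return sigma with sigma(alpha) == beta if beta has the form of alpha, else None."""
    sigma = {} if sigma is None else sigma
    if isinstance(alpha, str):      # a variable matches anything, but consistently
        if alpha in sigma and sigma[alpha] != beta:
            return None
        sigma[alpha] = beta
        return sigma
    if isinstance(beta, str) or beta[0] != alpha[0] or len(beta) != len(alpha):
        return None
    for a, b in zip(alpha[1:], beta[1:]):
        if match(a, b, sigma) is None:
            return None
    return sigma


alpha = ("Append", "q1", ("Enqueue", "q2", "i2"))
beta = ("Append", ("Nullq",), ("Enqueue", "q", "i"))
print(match(alpha, beta))   # a substitution exists in this direction
print(match(beta, alpha))   # but not in the other one
```

The second call illustrates why the relation is not symmetric: the function symbol Nullq in beta cannot be instantiated to the variable q1.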

Rewrite Rules 

A rewrite rule is an ordered pair of expressions (L, R), such that the variable set of R is contained in
the variable set of L. Usually (L, R) will be written L → R. A finite set of rewrite rules over a set of
function symbols Σ is called a rewriting system over Σ. Let R be such a rewriting system.

An expression α is reducible with respect to R if there is a rule L → R in R, and an occurrence u of α
such that α/u has the form of L. Let σ be a substitution such that σ(L) = α/u, and
β = α[u ← σ(R)]. Then we say that α directly reduces to β (using R), and write it as α → β (using
R). Where the particular R in use is clear from the context, this will be written simply as α → β. If α
is not reducible with respect to R, then we say α is irreducible with respect to R.

Let →* be the smallest relation on pairs of expressions which is the reflexive, transitive closure of →.
Thus, α →* β if and only if there exist expressions α0, α1, ..., αn, where n ≥ 0, such that α ≡ α0,
αi → αi+1 for i = 0, ..., n-1, and αn ≡ β. We read α →* β as α reduces to β.

Suppose α →* β, and β is irreducible. Then we say that α simplifies to β; β is called a normal form
of α. We denote the normal form of e as e↓. A rewriting system R has the unique termination
property (UTP) if the simplifies relation defined by R is a function; that is, every expression has at
most one normal form in R.

A rewriting system R has the finite termination property (FTP) if there is no infinite sequence
α0 → α1 → ... using R.



A rewriting system R is convergent if it has FTP as well as UTP. In such a case, every expression in 
the system has exactly one normal form. 
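
Direct reduction and simplification to a normal form can be sketched in a few lines. The Python code below is an illustration only. It assumes variables are strings and f(e1, ..., en) is the tuple (f, e1, ..., en); the two Append rules are hypothetical examples of ours and are not taken from the specifications in this thesis. The loop in normal_form terminates only when the rule set has FTP.

```python
def substitute(sigma, e):
    """Apply substitution sigma (dict from variables to expressions) to e."""
    if isinstance(e, str):
        return sigma.get(e, e)
    return (e[0],) + tuple(substitute(sigma, a) for a in e[1:])


def match(lhs, e, sigma=None):
    """Return sigma with sigma(lhs) == e, or None if e does not have the form of lhs."""
    sigma = {} if sigma is None else sigma
    if isinstance(lhs, str):
        if lhs in sigma and sigma[lhs] != e:
            return None
        sigma[lhs] = e
        return sigma
    if isinstance(e, str) or e[0] != lhs[0] or len(e) != len(lhs):
        return None
    for p, a in zip(lhs[1:], e[1:]):
        if match(p, a, sigma) is None:
            return None
    return sigma


def rewrite_once(e, rules):
    """Return some expression that e directly reduces to, or None if e is irreducible."""
    for lhs, rhs in rules:                      # try the empty occurrence first
        sigma = match(lhs, e)
        if sigma is not None:
            return substitute(sigma, rhs)
    if not isinstance(e, str):
        for i in range(1, len(e)):              # then occurrences inside the arguments
            reduced = rewrite_once(e[i], rules)
            if reduced is not None:
                return e[:i] + (reduced,) + e[i + 1:]
    return None


def normal_form(e, rules):
    """Repeat direct reduction until an irreducible expression is reached."""
    while True:
        nxt = rewrite_once(e, rules)
        if nxt is None:
            return e
        e = nxt


# Hypothetical rules: Append(q, Nullq()) -> q
#                     Append(q1, Enqueue(q2, i)) -> Enqueue(Append(q1, q2), i)
RULES = [
    (("Append", "q", ("Nullq",)), "q"),
    (("Append", "q1", ("Enqueue", "q2", "i")),
     ("Enqueue", ("Append", "q1", "q2"), "i")),
]
```

For a convergent system the order in which occurrences are tried does not affect the normal form, so the leftmost-outermost choice made in rewrite_once is just one convenient strategy.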
Appendix II - Checking Finite Termination

A general technique for proving termination of a rewriting system R with an alphabet Σ is to
demonstrate that it is possible to define a well-founded partial ordering ≻R on T(Σ) so that t1 → t2
implies t1 ≻R t2. A partial ordering is well-founded if there are no infinite descending sequences such
as t1 ≻R t2 ≻R ... for any constants. Hence, there cannot be any infinite sequence of rewrites using R
either. The following theorem [Manna&Ness] provides a useful guideline to define a suitable partial
ordering to prove FTP.

Theorem 3 A rewriting system R with an alphabet Σ satisfies FTP if there exists a well-founded partial
ordering ≻R on T(Σ) with the properties given below. We call a well-founded partial ordering that
satisfies the following properties a termination ordering for the system R since the ordering can be
used to show the termination of R.

(1) Reduction: For every rule L → R in R, and for every substitution σ of variables to
constants, σ(L) ≻R σ(R).

(2) Substitution: t ≻R t' implies f(...t...) ≻R f(...t'...) for any constants t, t', f(...t...), f(...t'...)
in T(Σ).



The reduction condition asserts that applying any rule reduces the subterm to which the rule is applied
in the well-founded ordering. The substitution condition guarantees that by reducing subterms the
top-level constant is also reduced. Hence it follows that t →+ t' implies t ≻R t'.

Fig. 20 gives a definition of a class of orderings called the lexicographic recursive path orderings (≻).
≻ is parameterized with respect to an ordering (>) on the alphabet of a rewriting system. In addition
to the substitution property mentioned in the above theorem, ≻ also contains the subterm relation: t1
is a subterm of t2 implies that t2 ≻ t1. Such an ordering is usually referred to as a simplification
ordering [Dershowitz]. (A proof that ≻ is a simplification ordering can be found in [Kamin].)



Fig. 20. The Lexicographic Recursive Path Ordering

Let > be an ordering on an alphabet Σ. Then ≻ on
T(Σ) is defined as follows, where s = f(s1, ..., sm) and t = g(t1, ..., tn):

s ≻ t iff one of the following conditions is true
(1) f > g ∧ s ≻ ti, 1 ≤ i ≤ n

(2) f = g ∧ <s1, ..., sn> ≻lex <t1, ..., tn> ∧ s ≻ ti, 1 ≤ i ≤ n

(3) (∃ si)[si ≻ t ∨ si ≡ t]

≻lex is the lexicographic ordering based on ≻. It is defined as follows.

<s1, ..., sn> ≻lex <t1, ..., tn> iff

(∃ 1 ≤ i ≤ n)[si ≻ ti ∧ (∀ i < j ≤ n)[sj ≡ tj]]

Dershowitz in [Dershowitz] has shown the following theorem:

Theorem 4 A lexicographic recursive path ordering (>-) is well-founded if and only if the underlying 
alphabet ordering (>) is well-founded. 
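
A comparison under a lexicographic recursive path ordering can be sketched as below. This Python code is an illustration under several assumptions of ours: constants are tuples (f, t1, ..., tn), the alphabet ordering is given by a dictionary prec of integer ranks (so it is total, whereas the definition only requires a partial ordering), and the argument tuples are compared starting from the rightmost position, which is our reading of the definition of ≻lex in Fig. 20. The two queue rules used in the example are hypothetical.

```python
def lpo_greater(s, t, prec):
    """Return True if s is greater than t in the path ordering induced by prec."""
    f, s_args = s[0], s[1:]
    g, t_args = t[0], t[1:]
    # Condition (3): some argument of s equals t or is greater than t.
    if any(a == t or lpo_greater(a, t, prec) for a in s_args):
        return True
    # Conditions (1) and (2) both require s to be greater than every argument of t.
    if not all(lpo_greater(s, b, prec) for b in t_args):
        return False
    if prec[f] > prec[g]:                       # condition (1)
        return True
    if f == g:                                  # condition (2)
        return lex_greater(s_args, t_args, prec)
    return False


def lex_greater(ss, ts, prec):
    """Lexicographic comparison of argument tuples, scanned from the right."""
    for a, b in zip(reversed(ss), reversed(ts)):
        if a != b:
            return lpo_greater(a, b, prec)
    return False


# Assumed ranks: nongenerator > non-nullary generator > nullary generator > defining type.
PREC = {"Dequeue": 3, "Enqueue": 2, "Nullq": 1, "Zero": 0}
N, Z = ("Nullq",), ("Zero",)

# Hypothetical rule instance: Dequeue(Enqueue(Nullq, Zero)) -> Nullq
lhs1, rhs1 = ("Dequeue", ("Enqueue", N, Z)), N
# Hypothetical rule instance:
# Dequeue(Enqueue(Enqueue(Nullq, Zero), Zero)) -> Enqueue(Dequeue(Enqueue(Nullq, Zero)), Zero)
lhs2 = ("Dequeue", ("Enqueue", ("Enqueue", N, Z), Z))
rhs2 = ("Enqueue", ("Dequeue", ("Enqueue", N, Z)), Z)
print(lpo_greater(lhs1, rhs1, PREC), lpo_greater(lhs2, rhs2, PREC))
```

Checking the reduction property of Theorem 3 for a rule amounts to establishing such a comparison for every ground instance of the rule; the sketch only tests individual instances.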



One can, in general, use any suitable well-founded alphabet ordering in conjunction with a 
lexicographic recursive path ordering to use it as a termination ordering for a rewriting system. 
Figures 21 and 22 give two alphabet orderings: The first can be used for an arbitrary data type
specification, and the second for an arbitrary homomorphism specification. We refer to these 
orderings as the standard alphabet orderings for a data type specification, or a homomorphism 
specification, respectively. The orderings are based on a general method of structuring of the alphabets 
of a data type specification and a homomorphism specification. Assuming that there is no circularity in 
the defining_types relation on data types, it can be easily shown that the standard alphabet orderings 
are well-founded orderings. 

A lexicographic recursive path ordering based on the alphabet ordering of Fig. 21 can serve as a
termination ordering for the rewriting systems corresponding to Queue_Int and Circ_List. We leave
it to the reader to convince himself that ≻ satisfies the reduction property in each of the two cases;



Fig. 21. The Standard Alphabet Ordering for a Data Type Rewriting System 

Notations 

S is the rewriting system corresponding to TOI

Σ is the alphabet of S

Ω is the operation set of TOI

Ω_B is the set of generators of S

Ω_NB is the set of nongenerators of S

Σ_Def is the union of the alphabets of the rewriting systems of the defining types

(We assume that the alphabets are mutually exclusive.)

> is a partial ordering on the symbols in Σ

Definitions

Σ = Ω_B ∪ Ω_NB ∪ Σ_Def

Ω = Ω_B ∪ Ω_NB

> is defined as follows. It is assumed that a similarly defined ordering exists for each of the alphabets
in Σ_Def. > is assumed to contain each of these orderings.

f > g iff one of the following conditions is true
(1) f, g ∈ Ω_B ∧ arity of g = 0, arity of f > 0

(2) f ∈ Ω_NB ∧ g ∈ Ω_B

(3) f ∈ Ω ∧ g ∈ Σ_Def
Fig. 22. The Standard Alphabet Ordering for Homomorphism Specification

Notations

Σ_H is the alphabet of the homomorphism specification

>_H is the standard alphabet ordering on Σ_H

Definition

f >_H g if and only if one of the following conditions holds:

(1) f is the symbol ℋ, and g is any other function symbol in Σ_H

(2) f is an auxiliary function symbol, and g is a generator function symbol



one needs to use the fact that ≻ contains the subterm relation in doing so. The ordering cannot,
however, be used for the Array_Int specification. The ordering can be used for a subset of the
specification that is used in examples to illustrate the synthesis procedure. A lexicographic recursive
path ordering based on the standard alphabet ordering of Fig. 22 can be used for all of the sample
homomorphism specifications given in the last chapter.

Lexicographic recursive path orderings are useful in defining a termination ordering for a rewriting
system that is built from two or more rewriting systems that have recursive path orderings already
defined on them. Suppose ≻1 and ≻2 are two recursive path orderings defined with respect to the
well-founded alphabet orderings >1 and >2, respectively. Suppose R1 and R2 are two systems for
which ≻1 and ≻2 can serve as termination orderings. Then the recursive path ordering that is based
on >1 ∪ >2 can be used as a termination ordering for the system R1 ∪ R2 provided >1 ∪ >2 is
well-founded. The standard alphabet ordering is such that the union of any two of them (defined on
mutually exclusive alphabets) preserves the well-foundedness property. Hence it is useful in the
context of combining rewriting systems.
Appendix III - Proofs of Theorems

Theorem 6 

Let S be a system that satisfies the principle of definition. Let e1 = e2 be an equation so that e1 and e2
have at least one nongenerator function symbol in them. Then, e1 = e2 is a theorem of S if
S ∪ {e1 → e2} also satisfies the principle of definition.

Proof The proof is by contradiction. Let us assume that S ∪ {e1 → e2} satisfies the principle of
definition, but e1 = e2 is not a theorem of S.

If e1 = e2 is not a theorem of S, then there exists a substitution σ that maps variables to
generator constants so that σ(e1) and σ(e2) have distinct normal forms in S. Since S satisfies the
principle of definition, σ(e1) and σ(e2) have unique normal forms that are generator constants; let the
normal forms be t1 and t2, respectively (t1 ≠ t2). Note that σ(e1) and σ(e2) are distinct from t1 and t2,
respectively, because the latter two are generator constants while the former two are not. Therefore, in
the system S ∪ {e1 → e2} we have the following situation:

σ(e1) → σ(e2) →+ t2, σ(e1) →+ t1, and t1 ≠ t2.

Thus, S ∪ {e1 → e2} violates the principle of definition. Contradiction.

Q.E.D.

Theorem 7 

PW is a Perturbed World. Suppose

(1) e1 is an expression so that for every substitution σ of variables to generator constants σ(e1)
is reducible using PW, and

(2) PW ∪ {e1 → e2} is convergent.

Then, e1 = e2 is a theorem of PW.

Proof PW is convergent. Therefore, to show that e1 = e2 is a theorem of PW, we have to show that
for every substitution σ of the variables in e1 and e2 by generator terms of the appropriate type, σ(e1)
and σ(e2) have the same normal forms.

The proof is by contradiction. Let us suppose that PW ∪ {e1 → e2} is convergent, but
e1 = e2 is not a theorem of PW. This means there exists a σ such that t1 = σ(e1)↓ and t2 = σ(e2)↓
are distinct. By the second premise of the theorem, therefore, we have the following situation in PW ∪
{e1 → e2}:

σ(e1) → σ(e2) →+ t2
σ(e1) →+ t1 and t1 ≠ t2.

Therefore, PW ∪ {e1 → e2} is not convergent. Notice the need for the second premise. If
we did not have this premise σ(e1) could be identical to t1, in which case PW ∪ {e1 → e2} is still
convergent.

Q.E.D.



Theorem 8 

A rewriting system R satisfies the principle of definition if it satisfies the following conditions: 

(1) R is well-spanned. 

(2) R has FTP. 

(3) Every critical pair <α1, α2> of R is such that α1 = α2 is a theorem of R.

Proof The first two conditions ensure that every constant in R has at least one normal form, and that 
every normal form is a generator constant. The following argument shows that every constant has a 
unique normal form. 

The proof is by contradiction. Suppose there exists a constant that has two distinct normal forms.
Then, according to the KB-theorem there exists a nonconvergent critical pair. This contradicts the third
condition in the statement of the theorem.

Q.E.D. 



REPORT DOCUMENTATION PAGE

Report Number: MIT/LCS/TR-276
Title: Automatic Synthesis of Implementations for Abstract Data Types from Algebraic Specifications
Author: Mandayam K. Srivas
Performing Organization: MIT Lab for Computer Science, 545 Technology Square, Cambridge, Ma. 02139
Controlling Office: DARPA, 1400 Wilson Boulevard, Arlington, Virginia 22217
Monitoring Agency: Office of Naval Research, Department of the Navy, Information Systems Program,
Arlington, Virginia 22217
Type of Report: Technical Report, June '82
Contract or Grant Number: DARPA N00014-75-C-0661
Report Date: June 1982
Number of Pages: 161
Security Class: Unclassified
Distribution Statement: This document is approved for public release and sale; distribution unlimited.